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1. Introduction 

The Monte Carlo method based on a Markov process 
has been quite a powerful tool of the model analysis in 
many-body physics such as condensed matter physics, 
statistical physics and field theory, fn the present re- 
view, we focus on a branch of Markov-chain Monte Carlo 
methods that have been developed remarkably during 
the past decade, i.e., the quantum Monte Carlo^ that 
samples from an ensemble of world-line configurations 
in the path-integral representation of the partition func- 
tion. The methodological advancement is largely due to 
the global update of the world-line configurations. The 
breakthrough was made by Evertz, Lana and Marcu,^ 
who proposed a new algorithm, called a loop algorithm, 
for the s — \/2 XXZ model, and later also by Prokof'ev, 
Svistunov and Tupitsyn,'^ whose approach, called the 
worm algorithm, seemed quite different at first sight. 
In a loop algorithm, the world-line configuration is up- 
dated in the unit of loops in the space-time formed by a 
stochastic procedure. It turned out that the loop update 
does not only reduce the critical slowing-down, but it 
also removes several other drawbacks of the conventional 
quantum Monte Carlo. In the worm algorithm,^ on the 
other hand, the world-line configuration is updated by 
the movements of a worm, i.e., a pair of artificial sin- 
gular points at which world-lines are discontinuous. A 
framework was proposed'' recently that unifies these two 
ways of updating and enjoys the virtues of both. In the 
present article, therefore, we focus on three important 
algorithms (or, to be more precise, three frameworks for 
algorithms); the loop algorithm,^ the worm algorithm,'^ 
and the directed-loop algorithm.^ In some special cases 
two of them are identical, e.g., the directed-loop algo- 
rithm applied to the s = 1/2 antiferromagnetic Heisen- 
berg model is nothing but a single-cluster version of the 
loop algorithm. 

Before the proposal of these new algorithms, simula- 
tions had been done with local updating rule on the dis- 
cretized imaginary time. The local updating rule is anal- 
ogous to the single-spin-flip Metropolis algorithm of the 
Ising model. While it provided the first systematic means 
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of numerical study of systems at finite temperatures, it 
had a number of drawbacks; (i) the critical slowing-down, 
(ii) the fine-mesh slowing-down (i.e., the slowing-down 
when the discretization step of the imaginary time is de- 
creased), (iii) non-ergodicity (the temporal and the spa- 
tial winding numbers are conserved), (iv) the discretiza- 
tion error, and (v) difficulty in measuring the off-diagonal 
quantities, (vi) the negative-sign problem. These draw- 
backs have been removed (or at least reduced) by the re- 
cent development of the quantum Monte Carlo method 
mentioned above. The critical slowing-down and the fine- 
mesh slowing-down have been reduced to the negligible 
level in most applications.^"^ The non-ergodicity and the 
discretization error have been completely removed.^ In 
addition, most of the off-diagonal quantities of interest 
can be measured.'^' The negative-sign problem is the 
toughest and only very limited solution is available. How- 
ever, there is at least a few cases where this difficulty can 
be overcome by the loop algorithm.'"''-' 

There are a number of articles already published on 
the quantum Monte Carlo. We here only refer the reader 
to a review article'^ for the achievement made before 
the loop algorithm was proposed. For the loop algorithm 
and related algorithms, an excellent overview'^ has been 
written on the loop algorithm by one of the founders of 
the algorithm. Still there remains a lot of technical diffi- 
culties for an unfamiliar reader to start simulations from 
scratch. Therefore, we feel it useful to put various tech- 
nical details together with the background mathematics. 
The purpose of the present article is, therefore, to present 
various ingredients in a single article in a form compre- 
hensible to non-specialists and ready to use for practi- 
tioners. In what follows we describe how we perform the 
simulation in detail and take a particular care in making 
the article practical. On the other hand, we do not intend 
to make the present article to be comprehensive; we men- 
tion applications only when it is necessary for illustrating 
a new idea and show how effective it is. As a result, only 
a few applications are discussed in the rest of the arti- 
cle. We refer the readers who are interested in various 
applications, as well as other things that we omit in the 
present article, to the review articles mentioned above. 

To make this article usable, we separate how from why, 
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1. e., the description of the algorithms from their math- 
ematical derivations. Therefore, those who need a quick 
start, rather than knowing why an algorithm gives cor- 
rect results, may skip the theoretical part (§ 2) and im- 
mediately go to §3 entitled "Numerical Recipe". This 
section is almost self-contained and only the minimal ref- 
erences are made to the other parts of the article. 

2. Theory 

In this section, we present a few algorithms, roughly in 
chronological order. Descriptions will constitute a math- 
ematical justification of the algorithms' validity, though 
they do not follow the conventional theory-and-proof for- 
mat. Some examples are also presented for illustrating 
relevant ideas and the efficiency of the resulting algo- 
rithms. While this style would make it easier to follow 
the logic that establishes the validity of the algorithms, 
it may make it hard to find the precise and detailed defi- 
nitions of the procedures. Therefore, wc; add another sec- 
tion following the present one, in which we concentrate 
on describing the procedures precisely. 

2.1 Cluster Update 

The improvements accomplished on the quantum 
Monte Carlo simulation during the last decade was 
largely due to the global update, in which configurations 
are updated in units of some non-local clusters. Such a 
method of updating is inspired by the Swendsen-Wang 
(SW) algorithm^'' for the Ising model. In fact, it is not 
merely inspired but has the same mathematical back- 
ground as the SW algorithm. This is manifested by the 
fact that the loop algorithm proposed by Evertz et al. for 
the 8 = 1/2 XX Z quantum spin model depends contin- 
uously on the anisotropy and in the limit of a large uni- 
axial anisotropy (i.e., the Ising limit), the algorithm con- 
verges to something equivalent to the SW algorithm. In 
this sense, the; loop algorithm for the quantum spin sys- 
tems can be considered as a generalization of the SW al- 
gorithm. The same is true for the single-cluster variant of 
the cluster algorithm by Wolff; the single-cluster vari- 
ant of the loop algorithm for the quantum Monte Carlo 
can be derived from the Wolff algorithm in exactly the 
same way as we can derive its rnultipk;-cluster variant 
from the SW algorithm. In what follows, we consider the 
multiple-cluster variant, when we have to choose one, 
while the generalization to the single-cluster variant is 
straightforward in many cases. 

We start with describing the SW algorithm to clarify 
the mathematical basis underlying almost all the algo- 
rithms discussed in the present article. Simply stated, 
the SW algorithm and other algorithms presented below 
are special cases of the dual Monte Carlo algorithm. 
In a dual Monte Carlo algorithm, the Markov process al- 
ternates between two configuration spaces; the space of 
the original configurations that naturally arise from the 
model (such as the spin configurations in the Ising model 
and the world-line configurations in the quantum lattice 
models) and the space of the configurations of auxiliary 
variables. It is up to us to define the auxiliary variables 
and the resulting algorithm depends on the definition. In 
what follows, we denote the original configuration by S 
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Fig. 1. A schematic process of a dual Monte Carlo. The arrow 
indicates dependencies. It should be noted that the new state 
S{n + 1) depends on the previous one Sin) only through G(n). 
The same is true for G{n + 1). 

and the auxiliary one by G. (We denote the size of spins 

by s instead of S, to avoid confusion.) Once the auxiliary 
variables are defined, a stochastic process is characterized 
by the transition probabilities T{G\S) and T(5|G), the 
former being of generating G with S given and the latter 
of generating S with G given. The stochastic process as 
depicted in Fig. 1 yields the limiting distribution 

lim Pn{S) = hm Prob(6'(n) = 5) oc W{S) 

n— *oo n— >oo 

provided that we define the transition probabilities so 
that the ergodicity and the extended detailed balance 

T{G\S)W{S) = T{S\G)W{G) (2.1) 

may hold. Here W{S) {W{G)) is an arbitrary positive 
function of S (G). It is specified for each individual case 
as we see below. 

Swendsen and Wang chose the auxiliary variable G„ 
to be a one-bit (i.e., 0-or-l) variable defined on each 
pair u of interacting spins. The auxiliary configura- 
tion G is defined as the set of all such variables: G = 
(Gi , G2 , • • • . G,/Vb ) , where A''^ is the total number of the 
nearest-neighbor pairs. It is very cumbersome to describe 
the procedure in terms only of variables, with no picture, 
though in the end such a description is needed for coding. 
We, therefore, resort to visual means whenever a suitable 
visualization is available. The local unit u on which we 
define an auxiliary variable G„ is not necessarily a pair 
of sites. In addition, G„ is not necessarily a 0/1 variable 
either. Therefore, the visualization varies depending on 
the problem. In the case of the Ising model, however, the 
visualization is done most naturally by representing an 
up-spin and a down-spin by an open and a solid circle, 
respectively, and an auxiliary variable by the presence 
(corresponding to G„ = 1) or the absence (correspond- 
ing to Gu = 0) of a solid line connecting two neighboring 
circles. (See Fig. 2.) 

Swendsen and Wang^^ proposed the following proce- 
dure of updating the spin configuration. For a given con- 
figuration, (i) assign G„ = or 1 probabilistically to each 
u, (ii) identify connected spins to form clusters, and for 
each cluster (iii) assign a common value ±1 to all the 
spin variables on it. In the graphical terms, this yields 
the following (i) connect nearest-neighbor circles with 
some probability, (ii) recognize the connected sets of cir- 
cles, and for each connected set, (iii) change the color of 
all circles simultaneously with a certain probability. The 
step (iii) is often called a 'cluster-fiip'. 

In the following, we show that this stochastic process 
produces the distribution of spin configuration propor- 
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tional to the Boltzmann weight 



WiS) = exp KJ2S^S, = l[w{Su), (2.2) 

V (y) / 

where u = Su = {St,Sj), and w{Su) = 

exp{KSiSj). Following Fortuin and Kasteleyn/^' ^'^ we 
decompose the local Boltzmann weight as 



The function w{Su,Gu) is defined as 



(2.3) 



w{Su,Gu) 



-K 







(G„ = 0) 
-K (S, = Sj and Gu = 1) 
{Si ^ Sj and G„ = 1) 



(2.4) 

It is easy to verify that this definition satisfies (2.3). Us- 
ing (2.3), eq. (2.2) can be formally rewritten as 



where 



WiS) = ^W{S,G), 



W{S,G) = l[w{Su,Gu). 



(2.5) 



(2.6) 



Once the target weight W{S) is written in the form of 
(2.5) with (2.6) and (2.3), we can in general satisfy the 
detailed balance (2.1) by defining the transition proba- 
bilities as 

W{S, G) 



T{G\S) = -^^^Jl^, T{S\G) 



(2.7) 



W{S) ' w{G) 

where W{G) = J2s ^C-^' ^ stated above, a Markov 
process with these transition probabilities yields the tar- 
get distribution W{S). 

For the graph-assignment probability T{G\S), we 
can rewrite (2.7) using (2.2) and (2.3) as T{G\S) = 
UutiGu\Su) with t{Gu\Su) being 



t{Gu\Su) = w{Su: Gu)/w{Su). 



(2.8) 



Since T{G\S) is factorized into the local factors t{Gu\Su), 
the graph assignment can be done locally; we can assign a 
graph element to each local unit independently with the 
probability t{Gu\Su)- For the Ising model, in particular, 
this transition probability is realized by the well-known 
Swendsen-Wang procedure, i.e., connecting each nearest- 
neighbor pair of parallel spins with the probability 1 — 



-2K 



and leaving them unconnected otherwise. 



For the spin-updating probability T{S\G), we similarly 

obtain 



T(^|G) = 2-^^(<=)A(^,G), 



(2.9) 



where Nc{G) is the number of connected clusters in G 
and A(S, G) is the function that takes a value 1 if and 
only if all spins in each cluster are aligned in the same 
direction in S. Therefore, this transition probability is 
realized by the step (iii) in Swendsen and Wang's proce- 
dure. Thus the validity of the procedure has been proved. 




Fig. 2. A spin configuration S and a matching graph G of the 
Ising model. The spin configuration is represented by the open 
and soUd circles whereas the graph consists of the lines connect- 
ing circles. 



2.2 Path Integral and Quantum Monte Carlo 

The description of the SW algorithm given in the pre- 
vious subsection is quite general; the only model-specific 
part is the first equality in (2.2) and (2.4). In fact, the 
loop algorithm for the quantum Monte Carlo can be re- 
garded as a special case of this framework. As long as the 
target weight W{S) can be expressed as (2.5) with some 
W{S,G), the transition probabilities (2.7) constitute a 
valid algorithm (provided, of course, that the ergodicity 
holds), regardless of the model we consider. Therefore, 
the only that we have to do is to specify ingredients such 
as 5, G, and W(S, G), which we do in this subsection. 

There are two ways of introducing S; one by the path 
integral^ and the other by the high-temperature series 
expansion. While the latter leads to a discrete-time 
algorithm with an exponentially small systematic er- 
ror, the former is simpler to describe. In addition, both 
the representations reduce to the same algorithm in the 
continuous-time limit. Therefore, we describe the frame- 
work starting from the path-integral representation. The 
formulation based on the high-temperature series expan- 
sion is discussed in §2.4. 

For the derivation of the algorithm presented below, 
it is often useful to consider the path integral in the dis- 
cretized imaginary time (though the discretization is not 
needed in the final algorithm). Such an expression can 
be obtained as follows. First we consider the identity. 




fe=i 



L 



(2.10) 



In particular, when the Hamiltonian is a sum of M terms, 

M 

n = J2'Hb, (2.11) 



then, (2.10) leads to 



6=1 



'I- nn ('-!«') 



Here, the summation is taken over some complete or- 
thonormal basis {ip} of the Hilbert space. Inserting 1 = 
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Fig. 3. A visualization of the path integral of a five-site system 
with discrete imaginary time. The world-Unes are represented by 
two thick gray lines. 

J2ip IV') (^1 between two adjacent factors, we obtain 

L M 

^ = ^'Zo ^ \{\{{'4>b+i{k)\{l-{I^T)m)\Mk)), 

^°°{V'!>(fe)}fe=16=l 

(2.12) 

where Ar = P/L. For simplifying the notation, we de- 
note (6, k) as u, ijjbik) as tp^, V'6+i(A:) as tp'^, Hb as H„, 
and {ipb{k)} as S. It follows that 

Z = lim y"WL{S), (2.13) 

L—>oo ^— ' 
S 

Wl{S) = llwiSu), 

U 

w{Su) = (V';|(1-(At)H„)|V«). (2.14) 

Thus the partition function is expressed as the sum of a 
weight that is a function of a space-time configuration. In 

the case of the particle systems, with the basis that diag- 
onalizes the local number operators, the space-time con- 
figuration is called a world-line configuration, since the 
configuration is visualized by trajectories of particles in 
the space-time. The configuration for spin models is also 
called the world-line configuration by regarding up-spins 
and down-spins as particles and holes respectively. An 
example of the world-line configuration is shown in Fig. 
3. The whole system consists of L layers whereas each 
layer contains M sub-layers. (The number L is called 
the Trotter number.) The "height" of each layer is Ar 
and the height of the whole system is always (3 regardless 
of L. Every TYb has its representative in each layer, i.e., a 
unit u called a plaquette. Each of M sub-layers contains 
a plaquette. 

In early world-line Monte Carlo algorithms, updates 
of a configuration were done in many steps,^'^^'^^ each 
being a local update that modifies only a small part of 
the system. Before the loop algorithm, the unit of the lo- 
cal update was a square whose spatial dimension equals 
the lattice spacing and the temporal dimension the dis- 
cretization unit of time. The square is shown in Fig. 4 
together with the world-line configurations before and af- 
ter the update. Because of the local nature of the updat- 
ing unit, the algorithm exhibits a severe slowing-down. It 
happens when we approach a critical point or zero tem- 



Fig. 4. One step in the local update algorithm. The squares such 

as the one drawn with dashed lines are the units of the update. 
At every step, one of the squares is chosen at random. The flip 
of the chosen square is accepted with a probability that depends 
on the weights of the configurations before and after the flip. 

perature. This can be intuitively understood as the dis- 
crepancy of the physical correlation length and the spa- 
tial scale of the updating imit. Another slowing-down, 
pointed out by Wiesler,^^ when the temporal scale of 
the system (i.e., the inverse evergy gap) largely differs 
from the temporal scale of the updating unit. The situ- 
ation occurs when one decreases the discretization unit 
of the imaginary time in order to reduce the system- 
atic error due to the discretization. It was proposed^^ 
that this slowing-down can be removed by applying the 
loop method only to the temperal direction . The algo- 
rithms discussed below solve both types of slowing-down 
in many cases of interest. 

2.3 Loop Update 

A loop algorithm for a quantum system can be con- 
structed in a similar fashion as the Swendsen-Wang al- 
gorithm mentioned in §2.1. That is, by introducing ad- 
ditional variables G„ = G{b,k) = 0,1, we can rewrite 
Wl{S) in (2.14) as 

wds)^Y{ J2 (^;|(-(At)h.)«"|v„) 

u G„=0,1 

= Y: (Ar)"^^) n « I I (2-15) 

G u 

= Y,WLiS,G), (2.16) 

G 

where G = {G„}, n{G) = J^u '^u, and 

Wl{S,G) = l[w{Su,Gu), (2.17) 

U 

w{S^,G^) = (V;|(-(Ar)W„)«"|^„). (2.18) 

Since these expressions (2.16) and (2.17) have the same 
form as (2.5) and (2.6), we can apply the prescription 
presented in § 2.1 for defining the transition probabilities 
T{G\S) and T{S\G) through (2.7), thereby constructing 
an algorithm of simulating a target distribution Wl{S). 
Since Wl{S) is an approximation to W{S), such an al- 
gorithm can be used as an 'approximate' algorithm of 
simulating the distribution W{S). (In §2.5, we see that 
this 'approximation' can be made exact.) 

In order to complete the definition of the algorithm, 
we have to specify the Hamiltonian, what orthonormal 
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set we use, and how we decompose it in (2.11). In what 
foUows, we do these and examine what procedure corre- 
sponds to the resulting transition probabihties. 

First of all, wc specify the meaning of the decom- 
position (2.11) of the Hamiltonian. We start with the 
graphical decomposition^' ^^"^^ of the pair Hamilto- 
nian Hij : 

Uij =Y.nij{9), Hij{g) = -a{g)Aij{g), (2.19) 



where g specifies a type of a graph element, a{g) is some 
positive constant, and \j(g) is an operator whose ma- 
trix elements are or 1. As shown in Table I, a A- 
opcrator corresponds to a graph element with two types 
of lines, each representing a condition for making the ma- 
trix element 1. A solid line connecting two spins repre- 
sents the condition that the two must be parallel whereas 
a dashed line requires that the two be anti-parallel. 

In the case of the s = 1/2 anti- ferromagnetic Heisen- 
berg model, 

Hij = -J{SfS^ + S^S] - {SfS] - 1/4)), 

which we use in the following as an example, the sum- 
mation in eq. (2.19) contains only one term: 



(2.20) 



where 9h is the graph clement shown in the third graph 
from the top in Table I. Here, for the orthonormal com- 
plete set ({V'} in (2-10)), we have chosen the set of the 
simultaneous cigcnstatcs of the z-componcnts of all spin 
operators, as we do in most of the present article. The 
operator Aij{g-B) is explicitly defined in terms of the ma- 
trix elements as 

(2.21) 



Equation (2.19) results in the decomposition of the 
total Hamiltonian as 7-^ = ^Yli{ij)^Yl,g'^iiis)- Thus the 
Hamiltonian is decomposed in the form of (2.11) by iden- 
tifying 6 and ((i,j),<7), i.e., u and {{i, j),g,k). Then, the 
algorithm follows from the prescription given in § 2.1. 

For example, the graph assignment probability t{G\S) 
in (2.8) becomes 



i(l|5„) = (At) X a(g) U'^ Aij{g) V« 



(2.22) 



for Su being a non-kink, i.e., i/i^j — tpu- On the other hand, 
if Su is a kink, or ^'^ ^ ipu, then t{l\Su) = 1. Thus, for 
the case of s = 1/2 antiferromagnetic Heisenberg model, 
the probability is 



*(1|^«) = { 



1 



4(Ar) 







{Su is a kink.) 

{Su=[°o 1 ) or ( 1 ; 
(otherwise) 

(2.23) 

Choosing the value 1 for G„ means that we place a 
graph of the type g on the plaquotte u. For all the plaque- 
ttes with the value 0, we assign the 'identity', or 'trivial' 
graph (the top row in Table I) representing the identity 
operator. In what follows, we call a plaquette on which 
a non-trivial graph-element is assigned a vertex. 



Table I. The graph elements, and the matrix elements of the delta 
operators Aij(g) corresponding to each element. The base vec- 
tors of the two-spin Hilbcrt space are |^,— | — ^,^), 
and I — 5,^5). To emphasize the difference in the constraints 
indicated by the lines, dashed lines are used when the connected 
spins must be anti-paxallel to each other whereas solid lines con- 
nect parallel spins. 
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The procedure that realizes the probability T{S\G) is 
the same as that in the SW algorithm; first identify the 
points connected by the lines and then fiip each clus- 
ter with probability 1/2. (When applied to the isotropic 
Heisenberg model, the graph elements (?cb or ^hb in Ta- 
ble I do not appear, and the resulting clusters are simple 
loops. The name of the algorithm follows from this fact. 
In the present paper, we use the name even for the cases 
where clusters are not simple loops.) A 'space-time' point 
(■j, fc) plays the same role as a site i in the SW algorithm. 
An example of one step in the loop algorithm is depicted 
in Fig. 5 for the s = 1/2 antiferromagnetic Heisenberg 
model in one dimension. 

It is useful to see what kind of loops and clusters 
are formed^' in various other cases. We consider the 
XYZ model described by the Hamiltonian 



H 



Ti-ii — 



_ T qX qX _ J qV qV _ J qZ qZ 

c fJx^2 j y i j 'J z'^i *^ j • 



(2.24) 



where c is a constant. As for the orthonormal complete 
set, we take the set of the simultaneous eigenstates of the 
z-components of all spin operators as above. Therefore, a 
basis vector ip can be uniquely specified by the eigenval- 
ues of the N operators, , , • • • , Sf^. In this represen- 
tation, when Jx and/or Jy are negative, the ofl[-diagonal 



6 J. Phys. Soc. Jpn. 



Full Paper 



Author Name 



Fig. 5. A loop update. The decomposition into loops (from the 
left to the middle) and the flipping of the loops (from the middle 
to the right) are shown for the s = 1/2 antiferromagnetic Heisen- 
berg model in one dimension. The only non-trivial graph ele- 
ments arc the 'horizontal' ones, gn, in Tabic I. The loops flipped 
in the transition from the middle diagram to the right are indi- 
cated by dashed lines in the middle diagram whereas other loops 
are drawn with solid lines. 



matrix elements of —Ti. may be negative. For the bipar- 
tite lattices, however, the number of the negative matrix 
elements in the whole configuration is even, which makes 
the weight W{S) always positive. Another way of seeing 
this^*^ is to divide the whole lattice into two sub-lattices, 
A and B, so that a site on the sub-lattice A is surrounded 
by sites on the sub-lattice B, and rotate the spins on the 
sub-lattice B. For example, when = Jy < 0, we ro- 
tate spins on the sub-lattice B around the z-axis, so that 
Sf -Sf and Sf -Sf. This rotation makes all the 
off-diagonal elements positive. In what follows, therefore, 
we consider the cases with no negative-sign problem and 
assume that all the off-diagonal matrix elements of —Ti-ij 
are non-negative. 

Then, the pair Hamiltonian Ti-ij, eq. (2.24), with = 
Jy > can be expressed with two graph elements. We can 
see this in the following graphical decomposition analo- 
gous to (2.20), 



Aii(5c) + ^^^Ay(5cB) 



(I) 



-1-/ ■ — J Jx+Jz 



\ji9c) 



\M (II) 



^ iA,.(5H) + -^A,,(5HB) (III) 

where the constant c in (2.24) has been chosen so that 
the matrix elements are positive in each form. Since we 
need an expression of the form (2.19) with positive a{g), 
only one of these three expressions can be used for a 
particular set of the values of Jx and J^. The form (I) 
can be used for the easy-axis ferromagnetic model (0 < 
Jx = Jy < Jz), the form (II) for the easy-plane model 
{0 < \Jz\ < Jx = Jy), and the form (III) for the easy- axis 
antiferromagnetic model {0 < Jx = Jy < — Jz)- The five 
types of graph elements shown in Table I arc sufficient 
for expressing the pair Hamiltonian in all the three cases. 

The second and the fourth elements in Table I are 
required for expressing the pair Hamiltonian of the easy- 
axis ferromagnetic model (the case (I)). The fourth 
graph-element binds all the four spins at, aj, a^, and 
a J. Therefore, in this case, a resulting cluster of spins 
that is to be flipped simultaneously is not generally a 
single loop but a number of loops bound together. On 
the other hand, the second and the third graph elements 
are sufficient for expressing the pair Hamiltonian of the 



easy-plane model (the case (II)), such as the XY model. 
In either graph element, the four spins are bound only 
pair-wise. Therefore, a graph G consists of loops in this 
case. For the easy-axis antiferromagnetic model (the case 
(III)), the graph elements required are the third and the 
last. Therefore, in this cluster is not a single loop, 

in general, similar to the case (I) . For more details of the 
algorithm and the case with a lower symmetry (i.e., the 
XYZ model), see § 3. (A sample program may be found 
at a web-site. ^^) 

Many applications of the loop updating method have 
been done. Here, we only show the result for the quantum 
s = 1/2 XY model in two dimensions, ^^'"^^ which clearly 
demonstrates the utility of the loop algorithm. 

It is well-known that the helicity modulus exhibits the 
universal jump at the Kousterlitz-Tliouless type phase 
transition. The system-size dependence of the quan- 
tity near the critical point is also predicted theoretically. 
In the quantum spin model, such as the s = 1/2 XY 
model, the helicity modulus is related to the fluctua- 
tion in the total winding number of the world-lines by 
T = (r/2)(W2),35 where W = {Wx,Wy) with Wx (Wy) 
being the total winding number in the x (y) direction. 
Therefore, we can estimate the critical temperature ac- 
curately by measuring the winding number. In Fig. 6, the 
raw data of the helicity modulus is shown. We can see the 
universal jump even in the raw data and obtain a rough 
estimate of the transition temperature Tkt ^ 0.35. We 
can obtain a much more precise estimate for the critical 
temperature by fitting the data to the theoretically pre- 
dicted form of the size dependence. The best estimate 
of the transition temperature has been obtained in this 
way. Note that it is difficult to estimate the transition 
temperature by means of a conventional world-line quan- 
tum Monte Carlo method, such as the one shown in Fig. 
4. This is because the auto-correlation time becomes too 
long as we approach the critical temperature from above. 
Unlike the ordinary phase transition, it is increasingly 
more difficult to equilibrate the system even after pass- 
ing the transition temperature since the system remains 
critical in the whole low-temperature region. Therefore, 
with conventional methods, we can obtain reliable esti- 
mates of various quantities only in the high-temperature 
region. Another reason that makes difficult the estima- 
tion by the local update is that it does not yield an er- 
godic algorithm; the winding number of world-lines is 
not allowed to vary. Therefore, one must observe other 
quantities, for which the size dependence is known less 
precisely, or introduce some additional global updates for 
making the winding number vary. The latter was done 
in an early simulations,^^ and later in a simulation of a 
bosonic system. '^'^ However, these additional global flips 
tend to form bottlenecks in the configuration space, slow- 
ing down the whole simulation. 

2.4 Formulation Based on the Series Expansion 

So far we have been using the approximation of the 
imaginary-time discretization. While we can use the 
finite-L expressions for constructing an approximate al- 
gorithm in order to obtain all the results that we need, 
the results would come with a systematic error due to the 
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Fig. 6. The hehcity modulus (or the superfluid density) T = 
(T/2)(W^) as a function of the temperature. The universal jump 
is expected at the point where T = 2T/ttJ. The error bars 

arc drawn, but most of them arc too small to be recognized. 
(Adopted from Harada and Kawashima.'^'^) 

imaginary-time discretization. Therefore, we would have 
to do an extrapolation to get the final result free from 
this systematic error. However, there are two ways to get 
rid of the discretization error. In the first method, which 
we discuss in § 2.5, wc perform the extrapolation to the 
continuous imaginary time in the algorithm, not in the 
numerical results. In other words, there exists a compu- 
tational procedure that operates directly on continuous 
degrees of freedom (on the floating-point variables, to be 
precise). In this method there is no discretization error. 
In the present section, on the other hand, we present a 
method with discretized degrees of freedom that yields 
an algorithm with a much smaller error (negligible for 
most purposes) than the naive discretized-time method. 

The formulation is based on the high-temperature 
series expansion, and is originated in Handscomb's 
method. It was later elaborated by Sandvik and 
coworkers.^' '^^''^^ It starts from the expansion of the par- 
tition function, 

Z= lim Zl, 

where 

^ /?" 

ZL = y^ ^L.Tr(-W)". 

Then, wc visualize each term by considering L "boxes" 
and put n "marbles", each corresponds to —H, into these 
boxes. Each box can contain one marble at most. There- 
fore, there are ^ ^ ^ distinct pictures corresponding to 
the same term. Thus we have 

Z, = Y: ^^^Tr (f[{-H)A , 

{ik} ' \k=l / 

where 7/5 = 0, 1 represents a filled or an empty box, re- 
spectively. When the Hamiltonian is decomposed into a 
product of local factors as in (2.11), we can rewrite the 



above as 

where u = {b,k), Hu ^ Hb, G„ = 0, 1, G = {G„} 
and n(G) = G„. The summation is restricted to the 
graphs G such that G(f, j.) = 7fe < 1. As we did in the 
path-integral formulation to obtain (2.12), wc can insert 
the identity operators expanded in the orthonormal com- 
plete set to obtain, 

^l = ^W^l(^), 
s 

G ' u 

(2.25) 

Apart from the factor {L — n{G))\l L\ and the restric- 
tion on the sununation, eq. (2.25) looks similar to (2.15). 
In fact, when i ^ 1, the difference in the factor is small 
because P"{L — n)l/Ll is approximately equal to (/3/L)". 
In addition, for large L, the difFcrencc due to the re- 
striction produces only a small difference, because the 
typical value of n(G) in an actual simulation should not 
depend on L (as long as L is large enough) and there- 
fore having more than one vertices in the same layer is 
an increasingly rare event for large L even if such an 
event is allowed. Therefore, eq. (2.25) derived from the 
series expansion is approximately equal to (2.15) derived 
from the path- integral formulation for the same L, and 
they become identical in the large-L limit. This means 
that the algorithms that follow are similar for a finite 
L and identical for the infinite L. The important dif- 
ference, however, is that the discretization error for a 
finite L is exponentially small for (2.25) whereas that 
for (2.15) vanishes only algebraically. Therefore, in the 
path-integral formulation, we need to take the infinite-L 
limit whereas in the series-expansion formulation, it is 
not necessary as long as L is large enough. 

All the algorithms, such as the loop algorithm and the 
directed-loop algorithm discussed below, can be derived 
from the series-expansion formulation as well as the path- 
integral formulation, and in most cases the mapping from 
an algorithm derived from the latter to the one derived 
from the former is straightforward. While a "discretized- 
time" algorithm based on the series-expansion represen- 
tation can be advantageous for efficient implementation 
(because only integral variables are needed there), we 
discuss in what follows a number of algorithms using the 
path- integral formulation to avoid the factor {L — n)\/L\ 
appearing in the expressions. 

2.5 Continuous- Imaginary- Time Limit 

The loop algorithm is useful not only in speeding up 
the simulation but also in taking the L —>■ go limit in the 
algorithm,^ which makes the algorithm free from the sys- 
tematic error due to the discretization of the imaginary 
time. 

For a large L, the target distribution Wl{S) of the 
spin configuration mimics the distribution W{S) = 
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Fig. 7. The correspondence between the world-line configuration 
in the discrete time and that in the continuous time. When, a 
loop algorithm is used, a real-time "animation" on the computer 

screen would look the same for both the representations as long 
as the imaginary time step Ar in the discrete representation is 
small enough. A uniform interval (UI) is indicated as a lightly 
shaded region. 



placed a vertex. The four corners of a plaquette are called 
legs. (See Fig. 21 for the names of various objects.) 

It may be helpful to summarize here the procedure 
of one Monte Carlo step with the continuous-imaginary- 
time loop algorithm. Starting from an arbitrary pair of 
S and G that match each other, first we remove all the 
vertices (i.e., graph clcnicmts) at which there is no kink. 
Next, for each pair of the nearest neighbor sites (ij), we 
decompose the interval (0, (3) into uniform intervals (UI). 
(Here, a UI for a pair of sites is an imaginary-time 

interval delimited by two kinks that involves one or both 
of i and j. (See Fig. 7)) For each UI, which we denote as /, 
and for each kind of graph elements, which we denoted as 
g, we generate an integer n with the Poisson distribution 
of mean Ia{g), and place n graph elements of the type g 
uniform-randomly in /. When this is done for all types 
of graphs, all the uniform intervals, and all the nearest 
neighbor pairs, we identify loops, or clusters. Finally we 
flip each loop (cluster) with probability 1/2. 



limL^oo Wl{S). It means that if we look at a configu- 
ration with a poor resolution in the imaginary time, we 
cannot tell whether the configuration is generated with 
the weight W{S) or Wl{S). Therefore, when we have a fi- 
nite correlation time in the target distribution W{S), 
we do not have a kink at which a state changes, i.e., 
tpb+iik) ^ i^bik) in (typically) ^t/^t consecutive layers. 
Let us consider an imaginary-time interval of the length 
I that includes many layers with no kink. As can be seen 
in (2.22), we assign a graph element of type g with prob- 
ability (Ar)a(gi) to a unit u when Su makes the matrix 
element of A(g) imity. Since there arc //At layers in this 
interval, the probability of assigning n graph elements of 
type g to this interval is given by 

( ^/(f) )((Ar)a(5))"(l-(Ar)a(,))^/(^^)-". 

In the continuous- time limit L — » oo this reduces to 

i(7a(9))"e-^«(^). 

This is nothing but the Poisson distribution with mean 
Ia{g). Therefore, instead of repeating the graph assign- 
ment procedure for all the plaquettes, wc can generate a 
number n with the Poisson distribution with mean Ia{g), 
and choose n points from / uniform-randomly. The result 
would be statistically the same as what we would obtain 
from the discrete-time procedure described in § 2.3 (with 
extremely large L). 

Another advantage of the continuous-imaginary-time 
algorithms is that we do not have to deal with the fine 
structure of the 'space-time'. For example, the time or- 
dering of the plaquettes with different b in each layer 
(such as the one shown in Fig. 3) can be arbitrary, be- 
cause in the continuous-time limit, individual plaquettes 
do not appear and therefore the order of them does not 
matter at all. 

Since we consider the At limit in what follows, 
the height of a plaquette is zero, i.e., a plaquette in the 
discrete time corresponds to a horizontal line. We call the 
horizontal line (plaquette) on which a non-trivial graph is 



2.6 Large Spins 

The generalization of the loop algorithm to larger (i.e., 
higher) spins can be done by replacing each spin operator 
by the sum of 2s s = 1/2 spins. ^ That is, we replace the 
spin operators in (2.24) as 

2s 

where each spin operator at carries s = 1/2. Accordingly, 

a basis vector is specified by eigenvalues of the 2sN op- 
erators {cr?^} (z = 1, 2, • • ■ , and /x = 1, 2, • • • , 2s). In 
what follows, we identify the label 'ipb{k) with a set of 
2sA'^ variables {(7^^}, where (t^ ^ = ±i denotes an eigen- 
value of af^^. The new Hilbert space spanned by these 
vectors has the dimension 2^*^, somewhat larger than 
the original one which is spanned by only (2s -I- 1)^ ba- 
sis vectors. Therefore, we have to eliminate many states 
in order to obtain the correct partition function of the 
original model. This can be achieved by introducing the 
projection operator p.40-4i ^ 

Z = Tv(e-'5«({'^'})) =Tv(pe-'^«({^-})) . (2.27) 

The projection operator P eliminates all the states that 
do not have corresponding states in the original problem, 
such as the singlet states in the s = 1 problem. 

When the original spins are split into s = 1/2 spins, 
the pair Hamiltonian Hij can be written as 

2s 2s 
IJ.=1 i/=l 

where Wi^jv if' the pair Hamiltonian that can be ob- 
tained by replacing Si and Sj by er^^ and (Tjv, respec- 
tively, in the definition (2.24). The pair Hamiltonian 
is nothing but the pair Hamiltonian of the s = 1/2 
model discussed above. It is thus obvious that we can ap- 
ply the general prescription described in § 2.2, § 2.3 and 
§ 2.5 simply by re-interpreting b as v), g) instead 

of {{i,j),g). As a result we have 2s vertical lines for each 
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site i as illustrated in Fig. 8. The graph-assignment pro- 
cedure must be repeated (2s)^ times corresponding to 
(2s)^ pairs of the indices /x and i/. The procedure is oth- 
erwise identical to the one for the s = 1/2 model. The 
types of the graphs and the graph-assignment density are 
exactly the same as the corresponding s = 1/2 model. 

Wc can handle the projection operator P through a 
graphical decomposition as we do for the Hamiltonian. 
It should be noted that the operator P projects the ex- 
tended Hilbert space onto the sub-space that is isomor- 
phic to the original Hilbert space. This sub-space consists 
of the simultaneous eigenstates of the Cashmir operators 
(Si)"^. The states must be symmetric in the /x space for 
each site. In other words, the state is invariant under 
any permutation of the split spins {<Tis, cri,2, ■ ■ ' ,cri,2s) 
for each i. Therefore, the projection operator is a product 
of local projection operators, and each local projection 
operator can be expressed as the sum of permutation 
operators; 



permutation grapli 



P = l[Pi, Pi 



1 



{2s)\ 



Here, the summation is taken over the set of permu- 
tations among the split spin indices /i = 1,2, ••■,2s, 
and is an operator that generates the permutation 
TT. Specifically, 



2s 



where ij) = ((7j,i, (Ti,2, • • • ,crj,2s)- This operator corre- 
sponds to a graph element that connects a point on the 
vertical line specified by (i, fi) to a point on the other line 
specified by {i,TT(fi)). This correspondence is similar to 
that of the operator Aij (g) and the graph element g as we 
see above. Therefore, in order to take the projection op- 
erator into account, wc have only to include the following 
step in the updating procedure. That is, after assigning 
graph elements to the vertices, we assign a special graph 
clement to the end points of the 2,s vertical lines for each 
site i. Each graph element represents a particular per- 
mutation of 2s spins and connects the end points of the 
2s vertical lines at r = /3 to those at t = 0. The graph 
element is chosen with equal probability from the ones 
that are compatible to the current spin configuration (at 
T = (3 and t = 0). 

Among a number of applications of the split-spin al- 
gorithm described in this section, we briefly mention the 
calculation done by Todo and Kato,^'^ since it is illustra- 
tive of the high efficiency of the algorithm. They com- 
puted the energy gap A between the ground state and 
the first excited state, and the correlation length ^ of 
the antiferromagnetic Heisenberg model in one dimen- 
sion at T = for s = 1, 2 and 3. This system is known to 
exhibit the Haldanc gap and is disordered even at zero 
temperature. The correlation length for s = 1 is about 6. 
Therefore, one can use the exact diagonalization for ob- 
taining a rather accurate estimates of various quantities 
in this case. However, since the inverse gap and the cor- 
relation length diverge exponentially as the spin length 
increases, it is increasingly difficult to obtain accurate 



vertices 




T=0 



T=0 



origiani spins split spins 



Fig. 8. The split-spin representation of a world-line configuration 
for the s = 1 quantum spin system. 



estimates for larger spins. They obtained the following 

estimates: 



i = 6.0153(3), 
^ = 49.49(1), 
C = 637(1), 



A = 0.41048(6) (s = 1), 

A = 0.08917(4) (s = 2), 
A = 0.01002(3) (s = 3). 



(2.28) 



It is obvious from this result that we cannot compute 
these quantities with the exact diagonalization method 
for s = 2 and 3. To our knowledge, these numbers are 
very difficult to compute by any other methods than the 
ones described in this article. The estimates for s = 2, 
for example, are the best estimates known so far. For 
the estimates for s = 3, we are not aware of any other 
methods that can compute them. 

2.7 Loop Algorithms with Non-Binary Loops 

In some applications, it is advantageous to use non- 
binary loop variables. For example, let us consider the 
bilinear-biquadratic interaction model with s = 1,^^ 



n = J^((cos6')Si • Sj + {sme){Si ■ Sjf). 

{ij) 



(2.29) 



Simulation of this model can be done with the split-spin 

method described in § 2.6 with or without the coarse- 
graining in § 2.11 and the details can be found in § 3. (For 
an application, see Harada and Kawashima.^^) In what 
follows, however, we consider an alternative algorithm 
which is particularly useful in dealing with special cases 
with higher symmetry. 

The model (2.29) obviously has the SU(2) symmetry. 
At 9 = ±7r/2 and 9 = ±7r/4, however, it possesses a 
higher symmetry than is obvious from the definition. 
Here we consider the case 6 = — 7r/2 for which 

'H = -jY,{iSi-S^f-l), (2.30) 

(u) 

where the constant —1 is added for convenience. The 
Hamiltonian (2.30), as well as the Hamiltonian at other 
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special values of 9, has the SU(3) symmetry. Using the 
ordinary representation basis \ai) (cTj = — 1,0,+1) 

the pair Hamiltonian can be rc-written as 

-n^J = J X S X A(gH)- (2.31) 

Here, S is the operator that carries the sign, whose ma- 
trix element is +1 or —1, and is —1 if and only if one 
of the initial state {(Ji,aj) and the final state {a[,(j'^) is 
(0,0) and the other is (1, —1) or (—1, 1). It is easy to see 
that this sign is irrelevant since the negative signs always 
occur in pairs leaving the sign of the whole system posi- 
tive. Therefore, we can simply neglect the operator S, as 
we do in what follows. The operator A((7h) is defined by 
its matrix elements, which we denote by A(S'„,5h) and 
are defined as 



(o--,cr'|A(5H)|crj,crj) 



= { 



1 (if Ui + Uj 



0) 



(otherwise) 

This is almost identical to (2.21). The only difFcrcncc is 
that the present operator is defined on a larger Hilbert 
space (o-j = —1,0,1) than the previous one {ai = 
— 1/2, 1/2). It is therefore obvious that the present prob- 
lem is a generalization of the ordinary SU(2) antifer- 
romagnetic Heisenberg model. The constraint imposed 
by A(5'h) upon the world-line configuration can be ex- 
pressed by the same graph element as the one in the 
SU(2) case, i.e., the horizontal graph in Table I. The 
local spins bomid by the graph must take the values 
complementary to each other (such as +1 and —1, or 
and 0). Therefore, once a loop has been formed, the 
local spin value must be everywhere along the loop or 
it must alternate between -|-1 and —1. As in the SU(2) 
case, choosing a local spin state at one point of the loop 
determines the state of the whole loop. The difference 
is simply that every loop can take three possible states 
rather than two. The loop flipping process must be al- 
tered accordingly when we consider the loop algorithm 
for the present model; we must choose one state among 
three possible ones with equal probability for each loop. 
All the rest of the procedure remains the same. For exam- 
ple, the graph assignment is done in the same way as the 
SU(2) case; the horizontal graph elements are assigned 
with the density J between two nearest neighbor sites if 
the local spin values at the two sites are complementary 
to each other. 

A similar algorithm can be constructed in other cases 
with lower symmetry, i.e., the SU(2) symmetry, for the 
parameter region — 37r/4 < < — 7r/2. We start from the 
expression^^ 

-Hij = J ((-sin^ + cos^)A(5h) - (cos^)A(5c)) , 

(2.32) 

where an irrelevant sign and an additive constant have 
been omitted. The symbol A{gc) corresponds to the 
cross graph in Table I. To be more specific, its matrix 

elements are 



A{Su,gG) 



1 (if ai = a; 
(otherwise) 



and ' 



^1) 



The loop construction and the loop hipping can be done 
in much the same way as described above. 

These algorithms can be easily generalized to the case 
where each local spin variables takes N possible values, 
i.e., ai = i-N + l)/2, {-N + 3)/2, • • • , (iV - l)/2. Of 
particular interest is the Hamiltonian that consists of 
A(5h) only, which possesses the SU(7V) symmetry. See 
the reference''^ for results of a numerical simulation. It 
should be also pointed out that the algorithm presented 
here is similar to the Swendsen-Wang algorithm^^ for 
the classical antiferromagnetic Af-state Potts model, in 
which a cluster is constructed in much the same way as 
the SW algorithm for the Ising model, and each cluster 
can take A'" diflFerent states. 

2.8 Magnetic Field 

For a number of models, the loop algorithm described 
above is the most efficient algorithm among the ones de- 
scribed in the present article. For instance, the easy-axis 
XX Z model with general spin size s can be best handled 
with the loop algorithm. The easy-plane XXZ models 
can also be simulated most efficiently with the loop al- 
gorithm if there is no external magnetic field parallel to 
the diagonalization axis, namely, the z axis in the present 
case. However, if we have such an external magnetic field 
and it is competing with the spin-spin couplings, the loop 
algorithm does not work.^ 

To see this, we first describe a simple loop algorithm 
for a case with magnetic field in the ^-direction, and see 
what makes it inefficient. In the simple algorithm, we deal 
with the magnetic field separately; we simply neglect the 
external field while assigning graph elements. Then, in 
flipping clusters or loops, we take it into account. This 
can be formally justified as follows. First decompose the 
Hamiltonian into the field-free part and the field 

part n^^\ 



n 



(1) 



Then, we have 



Wl{S) = UUb+iik) 



b,k 



-AtH 



(0) 



AtH. 



(1) 



Mk)). 



xl[{Mk) 

i,k 

Y,wi'\s,G)V{S) 



where we have assumed that the field part is diagonal. 
The factor V{S) is the contribution from the magnetic 
field term defined as 

where 

J-(fcAr) = (Vi(fc)|(-Wf^)|Vi(fc))- 
The factor V{S) can be rewritten in terms of clusters as 
V{S) =\{V,{S), 
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where Vc{S) is defined for each cluster c in G as 

Here dX is the {d + l)-diniensional integral in the 

space-time over the cluster c. 

It is obvious from this form that the magnetic-field 
term does not affect the graph- assignment probability 
(2.7) whereas it modifies the cluster-flipping probability. 

Substituting W{S,G) = W^^\s,G)V{S) for (2.7), we 
obtain 

T(S|C).A(5,G)n^X<^, 

where Sc is the specifier of the state of the cluster c. (For 
the s = 1/2 spin models Sc is a binary variable.) This 
indicates that we have to choose the cluster state Sc with 
the probability V{Sc)/J2s' V{S'c) for each c. When the 
external field is zero, this reduces the random unbiased 
choice between two possible cluster states as already ex- 
plained above. 

This procedure works well when the magnetic field is 
cooperative with the spin-spin couplings as is the case 
with the easy-axis ferromagnetic XXZ model with a 
uniform magnetic field parallel to the axis. However, if 
the field is competitive against the spin-spin couplings, 
as is the case with the antiferromagnet with a uniform 
field, the procedure becomes increasingly inefficient as 
the temperature is lowered. To see this, we consider a 
small system that consists of only two spins coupled 
with each other by an antiferromagnetic interaction. Let 
us first suppose that spins are totally aligned in the 
direction of the magnetic field. Because of the graph- 
assignment role presented above, the density with which 
we assign the non-trivial graph elements is zero. The re- 
sulting graph is therefore a trivial one that consists of 
two loops, i.e., two vertical lines going from the bottom 
of the system to the top. In order to visit a different spin 
configuration, we have to flip at least one of these two 
loops. However, the flipping must be done against the 
magnetic field, and the flipping probability is roughly 
proportional to e^^^ according to the simple procedure 
discussed above. Here H is the magnetic-field strength. 
Therefore, flipping seldom takes place at a low temper- 
ature regardless of the magnitude of J relative to H. 
However, we need to visit other states frequently, partic- 
ularly when J is much larger than H. When H is much 
larger than J, on the other hand, we need to visit the 
completely aligned state frequently. However, it hardly 
happens if we start from the anti-parallel state in which 
one of the two spins is up and the other is down because 
the transition probability to the completely aligned state 
is exponentially small at a low temperature. This is be- 
cause we cannot change the total magnetization unless 
we flip a loop whose temporal winding number is not 
zero; but such a loop can be formed only when no non- 
trivial graph elements are assigned to the system. Such 
an event takes place with an exponentially small proba- 
bility proportional to e"^''/^, regardless of H. In short, 
when the magnetic field competes with the other cou- 
plings, the transition probability from one value of mag- 




Fig. 9. Three elementary movements and their anti-movements 
of the head. At the positive head labeled '-|-', the local spin vari- 
able increases by one whereas at the negative one labeled '— ', it 

decreases by one. The thick line, therefore, is the part where the 
spin value is greater relative to the thin part. 

netization to another becomes very small at low temper- 
atures, making the simulation extremely slow. 

2.9 Worm Algorithm 

There are cases where one can avoid the freezing prob- 
lem d\ic to the magnetic field by using the worm algo- 
rithm.'^'^ Updates of the world-line configuration in the 
worm algorithm is done through stochastic movements of 
two discontinuity points at which the conservation rule is 
violated. In the case of particle-hole problems or s = 1/2 
quantum spin problems, a world-line terminates at these 
points. Only one of the two points moves aroimd in our 
implementation, and we call the mobile one the head of 
the worm and the other stationary one the tail. A worm 
is the pair of these two points. (A worm in the present 
paper does not have a 'body', in contrast to real ones.''^) 
The spin configuration is modified as the head moves 
around. There can be several types of heads depending 
on the change in the local state caused by them. In many 
applications, however, we only consider two types; the 
one for which the local state above the head is one higher 
than that below (positive head), and the one for which 
the opposite is true (negative head). The types of the tail 
are defined likewise. 

One step in the worm algorithm consists of three ele- 
mentary movements and their anti-movements: the cre- 
ation/annihilation of a worm, the vertical movement, and 
the jump /anti-jump, as illustrated in Fig. 9. Each move- 
ment is a stochastic transition that satisfies the detailed 
balance condition with respect to the weight in (2.14) 
with an additional contribution from the source term. 
That is, we consider the Hamiltonian TC — rjQ, where 
the operator Q represents the source and is the sum of 
local operators, Q = J2i Qi- The partition function is 
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expressed in the discrete imaginary time as 
Z = lim yWL{S), 



(2.33) 



X n(^^ 1(1 + (A^)^Q-)I^-) '(2-34) 

V 

whoro ?; specifics a local unit defined on a vertical line 
i so that every layer contains exactly one unit for each 
vertical lines. The symbol stands for Qi as W„ does 
for Tib in (2.14). The right-hand side of (2.34) consists 
of three parts; the contribution from the diagonal ma- 
trix elements of the Hamiltonian, the contribution from 
the off-diagonal matrix elements, and the contribution 
from the source term. Specifically, denoting the number 
of kinks as TVkink and the number of discontinuity points 
as A^dc, we can rewrite the weight as 

Wl{S) = WdX {At)'^''""'Wk X (Ar)^'^<^WvK, 



Wd = 



n (V;i(l-(Ar)W„)|V„) 



u:non— kink 



= exp 



Wk = 



;:kink 



Ww = H'WJvQvl^v). 



v.dc 

Since we only need up to the second order in rj, we trun- 
cate the last product at the second order, which is indi- 
cated by the prime in Ow dc- what follows, we consider 
only configurations with no worm or those with exactly 
one worm. 

The detailed balance condition 



T{S'\S)Wl{S) = T{S\S')Wl{S') 



(2.35) 



has to be satisfied by the transition matrices expressing 
three elementary movements of the head. 

In the creation process (Fig. 9(a)), we first choose a 
site «, a uniform interval of it, say /, and two tempo- 
ral positions in I, t and r', for placing the worm. Then 
we decide if the proposed placement is accepted. In this 
process, Ww and Wd are altered while Wk remains the 
same. Let the probability of choosing / be P{I), the prob- 
ability of choosing x and y in the intervals dx and dy 
respectively, P{x,y) dx dy, and the probability of accep- 
tance, Pcreate- For the invcrsc process, namely, the anni- 
hilation, we do not have to choose J, t or t'. We simply 
decide whether we erase the worm or not with probabil- 
ity -Pannihiiate- Thus the detailed balance (2.35) in this 
case can be rewritten as 

^P„eate X P{I) {dr dr' P{t, t')) Wd{S) 

= Pannihiiate X Wd{S') {dr dr'WwiS')) (2.36) 

where S is the state with no worm and 5" is the state 
with a worm whose head is at x and the tail y. The 



factor 1/2 on the left-hand side is due to the two possi- 
bilities concerning the initial type of the worm. Here we 
obviously have many degrees of freedom. One of many 
possible choices, though it may not be the optimal, is 
given by setting P{I) = |/|/(iV/3), which corresponds to 
choosing the interval I by "throwing a dart" . Then, we 
obtain 



-'create 
^annihilate 



R 



2Np Jj dx dy Wd iS')Ww{S') 



and 



dxdy P{x,y) = 



Wd{S) 



dxdyWD{S')Ww{S') 
jjdxdyWD{S')Ww{S'y 



(2.37) 



To be more specific, the acceptance probabilities can be 

chosen as 



Pr. 



min(l,P), Pannihiiate = min(l,i? ^), (2.38) 



and the free parameter -q is adjusted so that neither of 

Pcrcatc 

nor Pannihiiate is tOO SmaU. 

In practice, it is often too cumbersome to compute 
(2.37) every time a creation or an annihilation is pro- 
posed. Therefore, an alternative may be used. That is. 



dxdy P{x,y) = 



dx dy exp i 



y\ X AV) 



Jj dx dy exp {—\x — y\x AV) ' 



(2.39) 



where AV is the average excess action (per unit time) 
caused by the creation of the worm. 



AV^-l-m 



w, 



D 



{Sim 



Wd{S) J ' 



where S{I) is the world-line configuration that results 
from creating the worm with the tail at the bottom and 
the head at the top of the interval /. When this alter- 
native is used, R in (2.38) must be modified accordingly, 
so that the detailed balance condition (2.36) is satisfied. 
(As a result, the new R depends on the times r and r'.) 

The vertical movement (Fig. 9 (b)) is much simpler. 
The head moves to another point of the vertical line on 
which it is currently located. The new position is cho- 
sen from the interval / that contains no kink in it and is 
delimited by two kinks. The choice is made with an ap- 
propriate density so that the detailed balance is satisfied. 
Since the kink contribution Wk and the worm contribu- 
tion Ww are the same for the initial and the final state 
of the move, we have only to consider the diagonal part 
Wd for the detailed balance. Namely, the detailed bal- 
ance (2.35) is satisfied if the probability rfrPverticaK''') of 
choosing the new position of the head in the interval dr 
is c?TPverticai(''") Wd{S') where 5" is the state after 
the head position is moved to r. Therefore, PverticaKT) 
should be 

drWDiS') 



dTPvertical(T) 



JjdrWDiS') 



Finally we consider the jump and the anti-jump (Fig. 
9(c)). A jump is a movement in which the head changes 
its spatial position while the temporal position is kept. 
At the same time, a kink is created in a jump process 
between the two vertical lines. There are two kinds of 
jumps according to the temporal location of the kink to 



J. Phys. Soc. Jpn. 

be created; whether it is above the head or below. In the 
original article,^ one of the two is called a reconnection. 
We do not distinguish the two, since both the movement 
can bo done in exactly the same way. The anti-jump, 
too, has two kinds according to the position of the kink 
relative to the head. The detailed balance in the jump 
process can be worked out in a fashion similar to the two 
cases discusses above. This time, all three of Wd, Wk 
and Ww change. The detailed balance condition is 

Pjump X {P{x)dx)Ww{S)WD{S)WK{S) (2.40) 

= -Pa„ti-jump X Ww{S')WD{S'){WK{S')d(^,Al) 

where S' is the state after the jump. Pjump and Panti-jump 
are the probabilities of accepting a proposed jump and 
anti-jump, respectively, and P{x)dx is the probability of 
choosing the position of the new kink in the infinitesimal 
interval dx around x. We choose 

j^dxWn{S')WK{S'y 

Then, the acceptance probabilities must be chosen so 
that 

^jump _ Ww{S') Jj dx Wn{S')WKiS') 

i^.„li-jump WwiS)WDiS)WK{S) 

is satisfied. Then, one possible choice for the acceptance 
probability is 

-Pjump = min(l, R), Panti-jump = min(l, R~^). 

A few comments on the worm weight may be appropri- 
ate here. In general, we can assign a non-trivial weights 
to the head and the tail. A frequent choice is 

(V^IQ.IV-.), (2.42) 

where tjjy and V'i are the local spin states just below the 
head (or the tail) and the above, respectively, and Q„ is 
an operator that represents the order parameter relevant 
for the model. For example, for the XY model Qy = 
is used. In the s = 1/2 case, in particular, the weight is 
a constant. The reason for the choice (2.42) is obvious, 
considering the relationship between the head's trajec- 
tory and Green's function Vq{X' - X) = {Q{X')Q{X)), 
with X and X' specifying space-time points. (Sec §2.1G 
for estimators of various quantities.) When the worm is 
assigned the above-mentioned weight, it can be shown 
that Tq{X) is proportional to the frequency with which 
the head visits a location specified by X relative to the 
head's original location. Therefore, if the range where 
Tq{X) of 0(1) is determined by the system's correla- 
tion length, the head's trajectory extends a region whose 
linear size is roughly equal to the correlation length. 

This is desirable since this guarantees that the scale of 
the update coincides with the correlation length. This is 
also the case with the loop algorithm with no external 
field. However, the worm algorithm works better than 
the loop algorithm when a competing external field ex- 
ists. This is because the effect of the field is reflected in 
choosing each local movement of the head. Therefore, a 
typical trajectory of the head strongly depends on the 
strength of the field. In the loop algorithm, on the other 
hand, the loop construction is done with no reference to 
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the external field, making the typical loop, which cor- 
responds to the trajectory of the head, depends on the 
external field only indirectly. As a result, the acceptance 
ratio of the loop flipping can be extremely small in the 
loop algorithm whereas the acceptance is always unity 
in the worm algorithm (the local spin state along the 
trajectory is already changed when the head finishes its 
journey). 

2.10 Directed-Loop Algorithm 

The directed- loop algorithm^' ^ can be thought of as a 
hybrid of the loop algorithm and the worm algorithm. 
While it has an advantage of the worm algorithm, we do 
not need to do integrations for obtaining the transition 
probabilities. In addition, although the directed-loop al- 
gorithm becomes identic;al to the loop algorithm when 
the external magnetic field is zero, it does not have the 
freezing problem even when the field is turned on. 

The directed-loop algorithm can be formulated in 
much the same way as the formulation of the loop algo- 
rithm. Therefore, we start with (2.12) (or (2.14)). In the 
loop algorithm, we have decomposed the local Hamilto- 
nian into several terms, each corresponding to a particu- 
lar graph element. In addition, we split each original spin 
into 2s Pauli spins in the case of s > 1/2. Therefore, b in 
(2.11) is equivalent to ((ij), {iJLv),g). In the directed-loop 
algorithm, we do not decompose the local Hamiltonian 
at all. Accordingly, h in (2.11) must be regarded sim- 
ply as (ij). Then, the procedure of updating G follows 
from the general prescription in § 2.2. For example, we 
set Gu = 1 for a given u with probability (At)w(S'i,) = 
{AT)((ip'J\{-Hij)\ii)u) when = This means, in the 
continuous-time formulation, that vertices (which corre- 
spond to —Ti-ij here, rather than A (5) in the loop al- 
gorithm) are placed with the density (V'u|(— '^y)|V'«) in 
uniform intervals. In addition, a vertex is placed on every 
kink. 

The updating procedure for S, on the other hand, 
is quite different from that in the multi-cluster variant 
of the loop algorithm discussed in the previous subsec- 
tions. There, clusters are formed naturally as a result of 
the graph assignment because the Hamiltonian has been 
dcx;omposccl into graph elements. Since we do not have 
graph elements in the directed-loop algorithm, loop (clus- 
ter) must be formed in the ^-updating process rather 
than in the G-updating process. 

While the ^-updating is done with a worm in the 
directed-loop algorithm similarly to the worm algorithm, 
the head of the worm in the directed-loop algorithm can- 
not choose the positions at which it creates kinks unlike 
the worm algorithm. This is because G has been fixed 
(i.e., all the vertices are fixed) before the worm is cre- 
ated, and we cannot have a kink at a plaquette on which 
there is no vertex, i.e., G„ = 0. Therefore, new kinks 
can be made only at the vertices which are fixed dur- 
ing the worm's life-time. However, this is not an essen- 
tial difference because one can easily generalize^ the al- 
gorithm so that the vertices are generated dynamically 
during the head's motion. Another (probably more im- 
portant) difference between the directed-loop algorithm 
and the worm algorithm arises from the direction of the 
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Fig. 10. The four events that can happen when a head hits a 
vertex. Arrows indicate the directions of the heads' motion. The 
numbers indicate the local spin states. Namely, I = 0,1, ■ ■ ■ ,2s 
correspond to the local spin states Sf = —s, — s + 1, • • • ,s, re- 
spectively. 



head's motion. In the worm algorithm, the direction of 
the head's motion is biased only by the weight W{S. G) 
and there is no algorithmically preferred direction. In the 
directed-loop algorithm, on the other hand, the head has 
a "moment of inertia" and can go only in the direction 
that is the same as in the previous step. The head can 
change its direction of motion only when it is scattered 
by a scatterer, i.e., a vertex. Therefore, Gu = 1 can be 
interpreted as having a scattering object at u. This is 
a clear advantage of this method compared to the worm 
algorithm, because in the worm algorithm, a head in gen- 
eral goes back and forth along a vertical line, sometimes 
unnecessarily. When applied to the s = 1/2 antiferro- 
magnetic Heisenberg model, for example, the trajectory 
of the head is roughly the same as the loop in the loop 
algorithm when the field is absent. Therefore, the head's 
motion in the worm algorithm is a random walk along a 
loop. While it takes a time proportional to the squared 
length of the loop for a head to finish its travel in the 
worm algorithm, it takes only a time proportional to the 
length in the directed-loop algorithm. 

When the head arrives at a vertex, it may or may not 
change its location as well as the direction of motion. It 
has four possibilities as to the location after the scat- 
tering, namely, the four legs of the vertex (Fig. 10). The 
choice among the four is made probabilistically. However, 
unlike all the cases discussed above, we cannot use the 
detailed balance condition for determining the probabil- 
ity due to the direction of the head's motion. It is obvious 
that the probability of having the left-most state in Fig. 
10 as the final state is zero when the initial state is one 
of the four states on the right, because of the direction of 
motion. (The head is moving away from the vertex, not 
coming in.) Instead of the detailed balance, we use the 
time-reversal symmetry condition as we discuss below. 

The stochastic process of the directed-loop algorithm 
can be formally viewed as the stochastic process in the 
c^xtended state space. The extension of the state is done 
in two ways. As mentioned above, the first extension is 
due to the introduction of the auxiliary variable G, and 
the other is due to the introduction of a worm. Since the 
directed-loop algorithm is a kind of single-cluster algo- 
rithm similar to the Wolff algorithm, the whole stochastic 
process is not a simple alternating Markov chain as in the 
loop algorithm (Fig. 1). As illustrated in the top part of 
Fig. 11, the probability of generating a new state S{n-\-l) 
depends not only on the current graph G(n), but also on 
the current state S(n). This is in contrast to the multiple- 
cluster variant of the loop algorithm that corresponds to 
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Fig. 11. A schematic illustration of the directed-loop Monte 
Carlo. The top, middle, and bottom parts show an overview, 
one step, and one cycle, respectively 



Fig. 1. This updating process of the spin configuration is 
achieved by a number of worm creation/annihilation cy- 
cles. Each cycle starts with a state that contains no worm 
and ends with another worm-free state. Let us denote the 
initial state So and the final state Sq where q stands for 
the number of elementary motions of the head during 
the life-time. Each state between the two, Si,S2, - ■ ■ , or 
Sq-i, contains a worm. 

Let us denote the transition probability that governs 
the elementary head motion as Ty,{S'\S). (Here we have 
dropped the dependence on G of the transition probabil- 
ity because it is fixed throughout the cycle.) Instead of 
the detailed balance condition, this transition probability 
is chosen so that it satisfies the time-reversal symmetry 
condition 

T^{Sk+i\Sk)WL{Sk, G) = T^{Sk\Sk+i)WL{Sk+i,G), 

(2.43) 

where S is the state identical to 5 except the direction of 
the head's motion. In other words, S is the time-inversion 
of S. Note that the weight of a state does not depend 
on the direction of a head. Once (2.43) is satisfied, the 
ordinary detailed balance condition is recovered in the 
process from 5*0 to Sq, i.e., 

T{Sq\So)WL{So,G) = T{So\Sq)WL{Sq,G). 

This can be seen easily as follows. First we note that 

T{Sg\So)W{So,G) 

Q Si S2 Sq-l 

TT^{Sq\Sq-l)Tyj{Sq-l\Sq-2) ' ' * 

T^{Si\So)Wl{So,G) 

But because of the direction independence of the weight, 
by using the time-reversal invariance of the transition 
matrix (2.43) repeatedly, we obtain 

T^{Sq\Sq-{)T^{Sq-^\Sq-2)---T^{S^\So)WL{So,G) 

= T^(5o|5i)Tw(5i|52) • ■■T^{Sq-^\Sq)WL{Sq,G). 
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Thus, the detailed balance is recovered for every individ- 
ual path that leads from to Sq. 

Here we consider the weight of the states with worms. 
Since the worm is an artifact for the algorithm, in princi- 
ple we can assign any weight to the states with a worm."'^ 
The most natural definition, however, is to use the same 
expression as (2.14) with an additional factor for the 



worm 



46 



«^w(5x)=^(V';i^flV'x), 



(2.44) 



where x is h or t corresponding to the head or the tail, 
respectively. The local state is defined as (V'x)V'x), 
where tp^ and V'x are the local spin states just below and 
above the discontinuity point, respectively. The constant 
T] is included for adjusting the worm creation and anni- 
hilation probabilities. A similar factor for the tail is also 
included. The weight of a state with a worm altogether 
becomes 

Wl{S) = wUSh)w^{St) 

xll{i;'J{l-{AT)Hu)\^u), 

u 

where 5*11 and St arc the local states around the head 
and the tail, respectively. The product is taken over all 
the vertices (plaquettes). Note that the weight does not 
depend on the direction of the head's motion. 

Next, we consider how to define T^{S'\S) so that it 
satisfies eq. (2.43). Three cases must be considered; (i) 
the scattering of the head at the vertex, (ii) the pair 
creation, and (iii) the pair annihilation. We first look 
into the case (i). For the scattering process, (2.43) can 
be written as 



where 



T^{S'\S)w{Su+v^) = T^{S\S')w{S'^+J, (2.45) 



w{Su+w) = w{Su)w^{S^)- 



Here, Sw is the local state around the head and Su+w 
stands for {Su,Sw)- Remember that there are only four 
possible final states for each initial state. Suppose that 
Su^ is the initial loc;al state; of the vertex. The state Su"' 
is obviously one of the four possible final states because 
if the head turns back at the vertex, the state Su^ is the 
final state. Let us denote the inverse of the other three 
possible final states as and sij^\ Then, the 

four states si^^ (k = 1,2,3,4) form a closed set, i.e., if 
the initial state is among the four the final state is always 
the inverse of one of the four. Therefore, cq. (2.45) can 
be generally decomposed into several closed sets of four 
equations. 

In order to find a solution to one of these quartets, 
let us suppose that a matrix Wij exists and satisfies the 
properties 



(=tl;(5«))=^ 



and 



Wi-j = Wi 



(2.46) 



(2.47) 



Then, it is easy to verify that 

t,,(=T^(5W|5W))^^ 



satisfies the property (2.45). Therefore, the problem of 
solving (2.45) has been reduced to finding a symmetric 
matrix that satisfies (2.46) with given Wi. 

The following solution is always available for any 
model: 



WiWj 



(2.48) 



The final state is chosen simply proportional to the 
weight of the final state if we use this solution; hence the 
name "heat-bath" type solution. However, it has been 
known that this solution yields an inefficient algorithm 
in many cases. 

In (2.46), we have ten free parameters and only four 
equations. However, the bounce-free condition wu = 
is often imposed for obtaining better efficiency. In the 
case of the quantum s = 1/2 XX Z model, in partic- 
ular, the solution becomes unique with this additional 
constraints. Still, we have six free parameters left. While 
little is known about the general principle for obtaining 
solutions that lead to efficient algorithms, good solutions 
are known for many important cases. In the next sec- 
tion, we show such a solution for the XXZ quantum 
spin model with an arbitrary s. 

As for the pair creation/annihilation process, we have 
to consider the detailed balance between a state with a 
worm and a state without. Specifically, the relation 

T{S'\S) ^ T{S\S')w^{S[,)w^{S[) 

must hold for the transition probability T where S' and 
5* arc the states with and without a worm, respectively. 
Note that S and S" arc identical except that S" contains a 
worm. The symbols S'^^ represents the local state around 
the head, just before the collision of the head and the 
tail, whereas S[ is the state around the tail. It should 
be noted here that the creation of the worm consists of 
two steps; the selection of the position of the creation 
and the rejection/acceptance of the proposed creation. In 
the discrete-time representation there are NL positions 
at which we can place a worm. Therefore, denoting the 
acceptance probability for the creation by -Pcreatei we can 
write T{S'\S) as 

T{^S'\S) = -^^r^-Pcreate('S'v), 

where the is the local state around the proposed point 
of creation before the creation. On the other hand, there 
is no position selection in the annihilation process. There- 
fore, 

T{S\S') = -Pannihilate(5'-y). 

(Note S'^ = Sy.) The detailed balance condition becomes 
1 



NL 



-Pcreate('S'u) — -Paniiihilate('S'u)u'w(S'ii)Ww('S't)- 



This yields the choice of the acceptance probabilities 

-Pcreate = min(l, R), Pannihilate = min(l, i?~^) 
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Fig. 12. The four initial states of the worm. Circles denote the 
heads and squares the tails. A spin state on a dashed line is 
lower than that on a solid one. In the case s = 1/2, in particular, 
a solid line and a dashed line correspond up and down spins, 
respectively. An arrow indicates the direction of the head's initial 
motion. 



with 



R = NLw^{Sl,)w^{Si). 



In particular, when the worm, weight is the matrix ele- 
ment of Sf, we obtain 

As we did in § 2.9, we can use rj for adjusting the transi- 
tion probabilities. In general, we should choose rj so that 
none of the transition probabilities is too small. If the 
worm weight does not depend on the local state, as is 
the case with the s = 1/2 and s = 1 spin systems, we 
can choose the free parameter rj so that R = I. which 
is obviously the optimal choice. In general, however, no 
such choice exists and the creation probability and/or 
the annihilation probability is smaller than 1 at least in 
some cases. In § 2.11, we present an example of the choice 
for the XX Z model. 

It may be useful to consider here the case of the 
5=1/2 XX Z spin model to make the description con- 
crete. In this case, the pair creation/annihilation is sim- 
ple as discussed above. The pair creation is always ac- 
cepted at any proposed position and the pair annihila- 
tion takes place whenever the head meets the tail. When 
the worm is created at a point where the local spin is 
up, the upper discontinuity point is positive where the 
lower one is negative (see Fig. 12). For a point with a 
down spin, the types of the created worm should be the 
opposite. The vertex density, as stated at the beginning 
of the present subsection, is the negated diagonal matrix 
element of the pair Hamiltonian. For example, it is J/2 
for the antiferromagnetic Heisenberg model. The proba- 
bilities that governs the scattering of the head at vertices 
can be derived from solving the quartets of the equations 
discussed above. The result depends on the anisotropy. 
The solution is presented in § 3 for various cases. The re- 
sulting algorithm is rather simple for the antiferromag- 
netic Heisenberg model; whenever the head hits a vertex 
we let it make the horizontal scattering. 

In the original paper by Syljuascn and Sandvik,^ we 
can find a good example that shows the utility of the 
directed- loop algorithm. In Fig. 14 (b), the integrated 
auto-correlation time defined for the magnetization is 
shown (the middle panel) as a function of the magnetic 
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Fig. 13. Assignment of vertices and a cycle of the worm update. 
Kinks are indicated by solid horizontal lines whereas other ver- 
tices are indicated by dotted lines. First, all the existing non- 
kink vertices arc removed while new vertices arc assigned (1 to 
2). Then, a worm is created (3). The head starts moving and it 
changes the local spin state (4). Every time the head hits a ver- 
tex, one of four possible events in Fig. 10 is chosen stochastically. 
In this figure, both the scatterings happen to be the horizontal 
ones (5 and 6). When the head comes back to the original po- 
sition, it annihilates with some probability leaving a worm-less 
configuration (7). The cycle (such as the one from (3) through 
(7)) is repeated a number of times before the vertices are up- 
dated. 



field. The magnetization itself (a) and the average loop 
size (c) are also shown in the same figure. As has been 
discussed above, the presence of the magnetic field com- 
peting against the exchange couplings makes the con- 
figuration freeze in simulations with the loop algorithm. 
As a result, it is impossible to observe a magnetization 
curve such as Fig. 14 (a). By using the directed- loop al- 
gorithm, one can obtain the curve within a reasonable 
amount of computational time. However, the difficulty 
has not been completely removed as can be seen in Fig. 
14 (b). The figure shows that the auto-correlation time 
diverges between two successive plateaus in the magne- 
tization curve. So far, a solution to this problem is not 
known. 

2.11 Coarse- Grained Algorithm 

In general, the solution of the time-reversal-symmetry 
equation (2.43) is not unique. In addition, the choice of 
the worm weight is arbitrary. However, the efficiency of 
the resulting algorithm largely depends on these choices. 
While one can obtain the solution by solving the equa- 
tion (2.43) numerically in general, there is no automatic 
way to choose a good one. It is up to the practitioner's 
physical insight, experience, and, to a certain extent, luck 
to find a solution and worm weights that lead to an ef- 
ficient algorithm. Therefore, it is worthwhile to present 
some efficient solutions for models of particular impor- 
tance. We here consider the XX Z model with general s. 
For this model, a set of simple formulas for such solutions 
are known. "'^ It includes the single-cluster variant of the 
loop algorithm for the s = 1/2 case. Therefore, the al- 
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Fig. 14. The magnetization (a), the integrated auto-correlation 
time (b), and the average length of the loop (i.e., the number 
of the visited vertices) (c), plotted against the magnetic field 
h/J for the s = 1/2 antiferromagnetic Heisenberg model in one 
dimension. The solid and open circles are for /? = 64 and /3 = 128, 
respectively. The linear dimension of the system is 64 lattice 
spacings. (Adopted from Syljuasen and Sandvik.*) 



gorithm can be viewed as a natural generalization of the 
loop algorithm to the cases with larger spins and with a 
uniform magnetic field. While the solution was found in 
a way quite different from solving the time-reversal sym- 
metry condition (2.43), we can show that the resulting 
solution satisfies (2.43).^*' Below, we briefly describe the 
procedure for obtaining the solution. The explicit for- 
mulas for the head-scattering probability and the vertex 
density of the XX Z model are presented in § 3. 

The idea is based on the split-spin representation. As 
discussed in §2.6, it is in general possible to reformulate 
the model with s > 1/2 in terms of the 2s Pauli spins: 
Si =^ ""i.M- would obtain the algorithm in which 
a head moves around in the space-time that consists of 
2s vertical lines for each site. What, then, would happen 
if we look at the real-time animation of the simulation 
on a low-resolution monitor? The 2s lines are blurred 
and they appear to be a single thick line. In the blurred 
image on the monitor, we cannot tell on which one of 
2s lines, namely n, the head is. The only that we can 
tell is on which site, i, and at what time, r, it is. Simi- 
larly, we cannot tell on which one of 2s lines a particular 
vertex is footed while we can tell the site and the time. 
Suppose also that the single line in the blurred image 
look brighter when we have more up-spins in the 2s lines 
in the original image. Then, there are 2s -I- 1 levels of 
brightness distinguishable in the blurred image. As the 
head moves, it changes the brightness level of the line by 



It was pointed out''^ that such a blurred animation 
can be generated with a set of transition matrix defined 
directly in terms of the brightness, without constructing 
the original sharp image. We should note that we only 
need the blurred animation for our original purpose to 
compute various physical quantities. In short, the split- 
spin representation is not necessary for describing the 
algorithm or writing computer codes while it is useful in 
deriving them, as we see below. 

To see how the head-scattering probability can be de- 
rived, let us consider the general s = 1 antiferromagnetic 
Heisenberg model for example. Suppose that the head 
has just hit a vertex that is in the state Su (the first di- 
agram in Fig. 15). The probability of obtaining the last 
diagram as the final state of the scattering can be given 
as 

T{S'js^) = Y,T.^is'u\KmK\^u)T{i:u\Su). 

(2.49) 

The symbol T(i;„|S'„) is the probability that the origi- 
nal (sharp) image of the blurred image Su is S„. It is 
proportional to the weight of the original image, i.e., 

w(T,u, l)A(S„,S'u) 



T(S„|5„) 



where A(S„,S'„) = 1 if and only if Su is the blurred 
image of The weight w(S„, 1) is the one in the split- 
spin representation, 

The second factor T(E^|S„) in (2.49) is the scattering 
probability in the split-spin algorithm, i.e., the scatter- 
ing probability in the case of s = 1/2. The third factor 
T(5'(,[S^) only represents the compatibility of the final 
state S'u with its original image S^, i.e., 

T{s'jK) = m'u,s'u)- 

It should be noted here that we do not explicitly in- 
troduce the worm weight. In fact, it was pointed out^^ 

that the present algorithm agrees with the directed-loop 
algorithm that follows from a special solution to (2.46) 
with the choice of the worm weight: w{S^) oc (tp'^lSfltpx) 
(x = h,t). 

The worm creation/annihilation probabilities can also 
be obtained from the blurring (or coarse-graining). In 
what follows, we express the local spin state by an inte- 
ger I = 0, 1, 2, • • • , 2s, which corresponds to the 2s -f- 1 
eigenvalues of Sf, — s, — s -|- 1, — s -|- 2, • ■ • , +s, respec- 
tively. In the split-spin representation, we choose a point 
uniform-randomly from the space-time, and if the local 
spin state at the chosen point is up, we place a positive 
discontinuity point above the negative one. We do the 
opposite if the local spin state is down. When coarse- 
grained, this yields the following; when the local spin 
state at the chosen point is /, the probability of creating 
a positive discontinuity point above the negative one is 
I /2s. For the worm annihilation, if a positive discontinu- 
ity point is above a negative one before the "rendezvous" 
and the spin state between the two is I, the probability 
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Fig. 15. The derivation of the scattering probability of the head 
in the "blurred" algorithm. Su and S'^ are the initial and the 
final states in the blurred image whereas S„ and are the 
corresponding states in the original sharp image. (The suffix u 
is dropped in the figure for clarity.) The numbers represent the 
"brightness" of the line. 



that the two are on the same Une in the spUt-spin repre- 
sentation is (2s — Therefore, the annihilation takes 
place with the probability {2s — l)~^ in the coarse-grained 
algorithm. If the relative location of the head and the tail 
is the opposite, the probability is l^^. 

Finally, the vertex assignment density can be derived 
as follows. Let us consider an interval in which a local 
spin state is I on one of the two sites and m on the other. 
In the original (split-spin) image, we assign vertices with 
the density (c-^, (T^^|(-7ij^j>)|cri^, CTjy) between the two 
vertical lines specified by (i/i) and (jV). Therefore, in the 
blurred image, we assign vertices with the density 

= lmp++ + lfhp-\ +lmp-+ + lfhp , 

where p±± is the vertex density for the s = 1/2 model 
with the local spin state (±i, ±i). 

Below we see an example that shows the efficiency 
of the coarse-grained algorithm. Although the algorithm 
can be applied to an arbitrary s, we only show the case 
for the s = 1 antiferromagnetic Heisenberg chain in a 
uniform magnetic field. This model has the freezing prob- 
lem when simtilatccl with the loop algorithm, and it was 
one of the primary motivations for developing the coarse- 
grained algorithm. In Fig. 16, we can see the performance 
of the coarse-grained algorithm. For comparison, we ex- 
ploited the degrees of freedom in the time-reversal sym- 
metry equation and obtained many solutions. Algorithms 
1-4 in Fig. 16 are the ones chosen (in an ad-hoc manner) 
from them. (See the paper "^"^ for how these were chosen.) 
Plotted in Fig. 16 is A{M^)Ny^'^ /L, where A{M^) is the 
estimated statistical error of the squared staggered mag- 
netization and Ny is the average number of the vertices 
visited by the head during its lifetime. Here (only in this 
paragraph and in Fig. 16) L is the system size, not the 
Trotter number or the order of the expansion. Since the 
scattering process is the most time-consuming part of the 
code, the total CPU time is roughly proportional to the 
total number of scattering events of heads, including the 
"straight" scatterings. Therefore, the CPU time is pro- 
portional to Ny. This is why the statistical error should 

1/2 

be multiplied by Ny in order to make the comparison 
fair. In Fig. 16, we can clearly see that the coarse-grained 
algorithm performs as well as the best algorithm among 
the the other four (i.e., Algorithm 1). Obviously, there is 
no exponential slowing-down in the coarse-grained algo- 
rithm and Algorithm 1, as was the case with Syljuasen 
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Fig. 16. The statistical error in the estimate of the squared stag- 
gered magnetization multiplied by the square root of the average 
number of scattering events during the lifetime of a worm. The 
"present" algorithm is the coarse-grained algorithm described in 

the text. The system is the s = 1 antiferromagnetic Heisenberg 
chain of the length of 64 lattice spacings in a uniform magnetic 
field H = 0.1. Each point is a result of 50 sets of simulations, 
where each set consists of 20000 creations and annihilations of 
worms. (Adopted from Harada and Kawashima.*^) 



and Sandvik's solution for s = 1/2. 

2.12 Algorithms for Bosons 

In this section, we present an algorithm for simulating 

bosonic systems. The algorithm"*^ is based on mapping 
of bosonic models to spin models and the coarse-graining 
discussed in §2.11. The result is similar to the worm al- 
gorithm, as we see below. While the ordinary directed- 
loop algorithm can also be used for the boson models 
directly, a problem arises from the fact that the boson 
occupation number is unbounded in general. An artificial 
bound must be set to make the resulting solution to the 
detailed balance equation (2.46) meaningful. The limi- 
tation is, however, undesirable since the range of values 
that the occtipation number takes on in the equilibrium 
is not known a priori. While this is not a serious prob- 
lem in a uniform model where a typical valuc^ as well as 
a typical fluctuation in the occupation number is known, 
it can be serious in some cases, such as the soft-core bo- 
son model with random chemical potential; the typical 
occupation number may largely vary from site to site in 
the inhomogeneous potential. The algorithm presented 
below is free from this problem. 

In order to explain the idea, we consider a sim- 
ple modc;l of non-interacting soft-core bosons on a d- 
dimensional hyper-cubic lattice. The Hamiltonian is 

^ = '1 + ^-^0 - mE ^1^- (2-50) 

(y) ^ 

where t is the (positive) hopping amplitude, p is the 
chemical potential, and b] and bi are the boson creation 
and annihilation operators, respectively. In addition, the 
chemical potential must satisfy p < —dt. In order to map 
the boson model to the spin model, we use the Holstein- 
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Primakoff transformation,^^ 

S! = blh-S, 

where 5*^^, S~ and 5'f are spin operators on the ith site. 
With this transformation, the model of (2.50) is approx- 
imately transformed to an XY spin model, 



45 



(2.51) 



(ij) 



In the limit of infinite s, this mapping is exact. There- 
fore, if the infinite s limit of the coarse-grained algorithm 
of the spin system exists, it serves as an exact algorithm 
for the boson system. In the following, therefore, we con- 
sider the infinite s limit of the coarse-grained algorithm 
discussed in § 2.11. 

We first consider the beginning and the ending of a cy- 
cle; the creation and the annihilation of a worm. In the 
coarse-grained algorithm, a spin-lowering worm (i.e., the 
positive head (tail) above the negative tail (head)) is cre- 
ated with the probability l/2s. Here, / = 0, 1, 2, • • • ,2s 
specifies the local spin state, which corresponds to the 
number of bosons in the bosonic algorithm presented be- 
low. Since the number of particles is finite, the proba- 
bility is zero in the limit of infinite s. Therefore, we al- 
ways start a cycle with a spin-raising or boson-creating 
worm (i.e., the negative head (tail) above the positive tail 
(head)). On the other hand, when the head meets the 
tail, it may annihilate with its partner or simply pass 
it. The probability of the annihilation depends on the 
type of the head. If the head is of such a type that its 
passage increases the occupation number by one, i.e., if 
it is positive and moving downward or if it is negative 
and moving upward, then the annihilation probability is 
l/(2s — /) where / is the local spin state between the 
the head and the tail just before they come to the same 
location. The probability is zero in the infinite s limit. 
Therefore, the annihilation takes place only for a head 
whose passage decreases the occupation number by one, 
and it happens with the probability l/l. 

Next we consider the vertex assignment and the scat- 
terings of the head at the vertices. Since the density of 
vertices are proportional to s, at first glance, assigning 
the vertic;es in the coarse-grained algorithm is impossible 
in the limit of infinite s. However, the head goes straight 
through most of the vertices. The probability that the 
head changes the direction of motion at a vertex is pro- 
portional to 1/s. Therefore, the density of real scatter- 
ing events, which is the product of the density of vertices 
and the scattering probability at a vertex, remains finite. 
With this density of the scattering event, the imaginary 
time at which the next scattering happens can be gen- 
erated by a Poisson process, similar to what we do in 
taking the continuous-imaginary-time limit in the loop 
algorithm (see § 2.5). In this way, we can make the head 
move and scatter with a finite number of procedures. (See 
§ 3.3 for the details of the procedure.) 

In Fig. 17, the result of the numerical simulation using 
the present algorithm is shown together with the exact 
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Fig. 17. The superfluid density plotted against the average occu- 
pation number for the three-dimensional free lattice-boson sys- 
tem. The lines are the exact analytical values. (Adopted from 
Smakov et al.'*'^) 



result. Plotted is the superfluid density at tp = 2 
as a function of the average occupation number n. The 
transition point is around ~ 0.6. With the present 
method, there is no major difficulty in performing sim- 
ulations near the critical point as can be seen in Fig. 
17. Although not shown in the figure, we also tried a di- 
rect application of the directed-loop algorithm with the 
heat-bath- type solution discussed in § 2.10 to the present 
problem. For the reason mentioned at the beginning of 
the present section, we have to set the upper bound for 
the occupation number. When we set it to be 20, which is 
close to the minimal to perform a non-biased simulation, 
we observed that the simulation becomes very slow at 
n ~ ric- It was practically impossible to do a simulation 
when the occupation number exceeds the critical value. 

2.13 Negatwe-Sign Problem and Meron Algorithm 

The negative-sign problem is unarguably the worst ob- 
stacle in numerical simulations of quantum models. It is 
originated in the negative matrix elements of the Hamil- 
tonian. When some of the off-diagonal matrix elements 
of the local Hamiltonian Tlu are negative, in general the 
weight (2.14) can be negative for some of the states S. 
In such a case, we perform a Monte Carlo simulation of 
which the target distribution is |W^(S')|, not W{S). Then, 
we estimate an arbitrary quantity Q using the identity 

j:s\W{S)\sgn{S)QiS) 



(Q) 



thermal 



Es\WiS)\sgniS) 

j:smS)\sgn{S)Q{S)/^smS)\ 
EsmS)\sgn{S)/j:s\W{S)\ 

(sgn(5)g(5))MC 



;sgn(5)) 



MC 



(2.52) 



where sgn(S') = ±1 is an abbreviation of sgn(VK(5)) 
and (• • • )mc is the Monte Carlo average with the weight 
\W{S)\. 

The negative contribution to the partition function, 
Z-, cancels out with a part of the positive contribu- 
tion, Z-^-, and the total Z = Z+ — Z- must be always 
positive. In fact, in many cases, the negative contribu- 
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tion almost completely cancels out the positive one. This 
can be seen*^ by considering a fictitious Hamiltonian H', 
whose matrix elements are the absolute value of the cor- 
responding matrix elements of the original Hamiltonian, 



(V^'IWIV) (^' = V') 
\{^|J'\7^m (^'^v) ■ 

The difference between the free energy per site A/ = 

{F' - F)/N is of 0{N^)., when the difference between 
the two Hamiltonians is extensive. It follows that 

Z 



Z+ - Z_ 



z+ + z_ 

where Z' is the partition function of the fictitious sys- 
tem. It is clear from this expression that the positive 
and the negative part cancel out almost completely at a 
low temperature and/or for a large size, and it becomes 
practically impossible to estimate the denominator (as 
well as the numerator) in (2.52) because of the statisti- 
cal error. 

In some special cases, the loop algorithm is useful for 
solving the negative-sign problem in fermionic systems. 
Considering that there are not many cases where the 
negative-sign problem is overcome with or without the 
loop algorithm, it may be worthwhile to mention here 
the meron algorithm^'^' -"^-^ by which we can overcome the 
negative-sign problem in some cases. Let us consider the 
simplest non-trivial model for the fermion, 



n = 



-t 



(ij) 



{c\cj + c]ci) 



+C/^(ni-l/2)(n^-l/2), 
(ij) 

where t > and U > 2t. The spin degrees of freedom 
are absent. We first consider the case U = 2t and then 
describe how to generalize the algorithm to the case U > 
2t. 

Apart from the fermion sign due to the exchange of 
particles, when U = 2t, this problem is identical to the 
s = 1/2 antiferromagnetic Heisenberg model (eq. (2.24) 
with Jx = Jy = Jz)- Therefore, the graph decomposi- 
tion of the Boltzmann weight can be done only with the 
horizontal graph gn (See Table I). Using the world- line 
representation discussed above, we can express the par- 
tition function as 



z = Y,^m{S)WL{S). 



(Here, the absolute value of the weight in (2.14) is sim- 
ply written as Wl{S).) The sign is positive if and only 
if the world-lines of the fermions corresponds to an even 
permutation of particles. The weight Wl{S) is the same 
as the one for the s = 1/2 antiferromagnetic Heisen- 
berg model. Applying the graphical decomposition of the 
weight(2.19), we can rewrite it as 

Z = Y,^Er^{S)WL{S,G). 

S,G 

Let us choose an arbitrary G and fix it, then consider 
a loop variable ai = 0, 1 assigned to every loop I in G. 



(We can specify a state S compatible to G by specifying 
the values of these variables.) The following properties 
can be shown for the sign of the compatible states to G: 
(i) The sign sgn(S') can be expressed as a product of 
factors each of which is a function of only one of the loop 
variables, and (ii) if the sign sgn {S) does not depend on 
the choice of the loop variables, sgn (5') = 1. 

The example of the factorization (the property (i)) is 
shown in Fig. 18. There are two loops in the middle di- 
agram, c and d, whose flipping changes the sign of the 
whole configuration. Note that these loops always cause 
a change in the sign no matter what the initial state may 
be. On the other hand, all the other loops do not cause 
any sign change. Generally, whether a fiip of a loop causes 
a sign change or not depends only on the geometric fea- 
ture of the loop, not on the initial spin configuration. 
Therefore, one can determine if a given loop may cause 
the sign-change or not without knowing the state S. The 
loops that cause sign change are called merons. In Fig. 
18, there are two merons, c and d. 

It is easy to show the positivity of the state with no 
merons (the property (ii)). It suffices to show that there 
is at least one positive configuration among the ones com- 
patible to G. But the sign of the Neel state is positive 
and the Neel state is compatible with any G. To see the 
latter, let us notice that the constraint imposed by any 
graph is simply that when we trace a loop the local spin 
state must alternate everytinic; we make the horizontal 
(spatial) move. The Neel state obviously satisfies this 
condition and is compatible to any G. 

Because of the two properties, the sign can be ex- 
pressed as 

I 

where ei = —1 for merons whereas e; = 1 for the other 
loops. The variables denoted as ai are the loop variables; 
each of them takes or 1 . They are defined so that u; = 
for all the loops I if the whole system is in one of the two 
Neel states. (The choice of the Neel state does not matter 
because there are always an even number of merons.) The 
whole partition function now becomes 

where S{{<ti}) is the state specified by the loop variable 
(Tj. But the summation over {ai} is zero if there is a 
meron in G. Therefore, the summation over all G can be 
replaced by the summation over all meron-free G; 

y(G)2^°(^) 



G 

(no meron) 



where 



ViG)^Y.^{S{{ai}),G). 



Wi} 

Thus, it becomes apparent that we can avoid the negative 
sign problem if it is possible to perform a simulation in 
the restricted phase space of the meron-free graphs. 

This can be achieved in the following way. Every time 
insertion or removal of a graph element is attempted, 
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Fig. 18. A loop decomposition (from the left to the middle) and a 
loop flip (from the middle to the right) in a four-spin chain of the 
s = 1/2 antiferromagnetic Heiscnberg model with the periodic 
boundary condition. The initial spin configuration is the perfect 
Neel state. The only non-trivial graph elements are the horizontal 
graphs in Table I. (All the graphs are drawn with solid lines for 
clarity whereas dashed lines are used in Table I for emphasizing 
the relative spin orientations) The loops, c and d, in the middle 
diagram are merons. From the middle to the right, the loop d is 
flipped. As a result, the two particles are exchanged in the left 
diagram, which makes the sign of the whole diagram negative. 



we check whether the attempt wotild create merons. If 
it would, such an attempt is rejected. The actual inser- 
tion or removal is executed only when it does not cre- 
ate merons. This checking procedure requires a relatively 
large amount of computational time and raises the com- 
plexity of the algorithm. This additional complication, 
however, is of the algebraic type at most and pays off 
considering the exponential complexity of the simulation 
with negative signs. In an actual simulation, because of 
the necessity of computing the susceptibility, we soften 
the condition of no merons. Namely, we need to sam- 
ple the graphs with two merons in order to estimate the 
susceptibility. Therefore, sampling of graphs is usually 
done for two-meron graphs as well as meron-free graphs. 
In this case, the insertion and the removal are rejected 
only when an attempt is made to create the third meron. 
For the same purpose, instead of placing a strict upper 
limit in the meron number, we can also use an extended- 
ensemble method with respect to the meron number, in 
which we consider a fictitious weight Ty(nineron) for con- 
trolling the meron number. (For the extended-ensemble 
method, see §2.15.) 

The algorithm can be easily generalized to the case 
where the easy- axis anisotropy exists, i.e., the case where 
U > 2t {Jz > Jx)- In the easy-axis case, we have to 
introduce the binding graph ^hb that bind two loops into 
one cluster. (See Table I.) For instance, if two merons 
are bound they form a non-meron. Insertion or removal 
of these binding graphs is performed in the same way as 
above; first count the number of new merons that would 
be created or annihilated by the attempt and then accept 
or reject the attempt according to the restriction or the 
fictitious weight mentioned above. 

2.14 T = Simulation 

The ground state properties are of particular inter- 
est in low- dimensional models. While the ground state 
properties can be deduced, in principle, by extrapolating 
the finite temperature results to zero temperature, it is 
prone to the systematic error due to the extrapolation 
especially when we do not have a solid knowledge about 
how low the temperature must be to obtain a reasonable 



extrapolation. Evertz and von der Linden''^ proposed a 
method for directly obtaining the zero-temperature ex- 
pectation values for a given model without extrapolation. 
The method is a special application of a more general 
method (proposed also by them) for obtaining results 
for an infinite lattice. (Note that a system at zero tem- 
perature is regarded as a system with an infinite length 
in the imaginary time direction.) Since this idea can be 
most clearly described using an example of the ferromag- 
netic Ising model, let us consider this case first. 

The method is closely related to Wolff's single-cluster 
algorithm. In Wolff's single-cluster algorithm, we start 
the construction of a cluster from a null cluster that con- 
tains no spin. Then, we choose a spin at random from the 
whole system and include it in the cluster. Next, we as- 
sign bonds between the spins sorrounding the cluster and 
the ones already in the cluster in the same way as the SW 
algorithm. If a new spin is bound to a spin that is already 
in the cluster, the new spin is also included in the cluster. 
This procedure is continued until there is no surround- 
ing spins to be checked. When the cluster construction is 
finished, the spins in the cluster are flipped. The Wolff's 
procedure is statistically the same as constructing clus- 
ters by the SW algorithm for the whole system, choosing 
a point at random, and then flipping the cluster contain- 
ing the chosen point. However, the Wolff's procedure is 
obviously better in efficiency for constructing a single 
cluster because bond assignments to the spins outside of 
the formed cluster are a waste of time. 

Roughly speaking the method we discuss here is the 
Wolff's procedure applied to an infinite system. The dif- 
ference arises from the fact that choosing a point at ran- 
dom from an infinite system does not make much practi- 
cal sense. Therefore, we simply stick to the same point, 
which we call the origin, throughout the simulation in the 
new method. In addition, we have to make sure that the 
cluster does not percolate, since if it does we cannot fin- 
ish the cluster construction within a finite computational 
time. To this end, we starts from the Neel state as the 
initial spin configuration. Due to this choice of the initial 
spin configuration, the first round of the cluster construc- 
tion is destined to be very short since we can assign no 
bond that connects the origin to its nearest neighbor. 
After flipping this flrst cluster, i.e., the one that consists 
of the origin only, we start the second round. We can 
go a little further this time. Since the spin at the origin 
is now aligned with its neighbors, we assign bonds be- 
tween them with a non-zero probability. As a result, the 
cluster that we obtain from the second round tends to 
be larger than the first one. In this way, we can go on. 
It should be noted that we only have to store the spin 
configuration of the part that is modified in the simu- 
lation at least once and do not have to keep the spin 
configuration beyond the frontier. It must be also noted 
that the frequency of visiting a site whose distance from 
the origin is R is proportional to the correlation func- 
tion T{R) = {S{R)S{0)). Therefore, the 'already visited' 
region does not grow too fast after its size reaches the 
correlation length. This means that the whole process is 
manageable as long as the system is in the disordered 
state. Since the distribution of the spin configuration 
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within the region approaches the equihbrium one, one 
can in principle compute any correlation function that 
can be defined within this region. 

The procedure can be easily generalized for studying 
the disordered state of the quantum system. For exam- 
ple, in the case of the s = 1 antiferromagnetic Heisenberg 
model, which is disordered even at T = due to the Hal- 
dane gap, let us consider the split-spin loop algorithm. 
The initial spin configuration is the complete Neel state 
where the up-spin straight world-lines and the down-spin 
ones alternate. Initially, only the local spin configuration 
at the origin is stored in the computer memory. The head 
is placed on the origin and the direction of the initial mo- 
tion is set to be upward or downward (with probability 
1/2). We then generate the scattering object in this di- 
rection at some stochastically chosen distance from the 
origin. The distance is generated following the Poissonian 
process with the density J/2 for each nearest neighbor 
site, as we have scien in §2.10. At the collision, the head 
changes its direction of motion as well as the location, as 
it does in the directed-loop algorithm. The same proce- 
dure is repeated until the head comes back to the origin. 
The spin configuration is stored only for the part that 
has been visited by the head. 

When the system has zero energy gap, a naive applica- 
tion of this method would fail, since the 'already visited' 
region would glow endlessly. This can be avoided, how- 
ever, by using a system whose spatial size is finite while 
its temporal length is infinite {(3 = oo). This is because 
there is usually a finite gap associated with the system's 
spatial dimension. Therefore, we can at least perform a 
zero-temperature simulation for finite systems. This is 
still somewhat advantageous compared to ordinary sim- 
ulations for which one needs two kinds of extrapolations, 
the one with respect to the size and the other with re- 
spect to the temperature. 

2.15 Extended- Ensemble Methods 

While extending the state space by introducing the 
auxiliary variables G is a powerful way of speeding up the 
simulation as we have seen, there is another general strat- 
egy for efficient simulations. That is, we can overcome a 
slow relaxation by using a fictitious target weight, W{S) 
instead of the given weight W{S). Simulation methods 
based on this idea are called extended-ensemble meth- 
ods. In most of the extended-ensemble methods, one does 
not fix the target weight W{S) but use it as an ad- 
justable function. One of the successful examples is the 
multi-canonical Monte Carlo (MCMC) method. ^^"^^ In 
the MCMC method, the weight depends on the state S 
only through the energy E{S). Therefore, we have 

W{S) = w{E{S)), 

and the function 'w{E) is adjusted adaptively to make 
the frequency of having the event E = E{S) indepen- 
dent of E. To this end, several sets of simulation may be 
performed. For the first set, the initial guess of the appro- 
priate weight that is often used is 'w{E) = const. In every 
set, the histogram h{E), i.e., the number of times E{S) 
takes the value E, is recorded. In other words, every time 
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the configuration is changed we do 

h{E{S)) := h{E{S)) + 1. (2.53) 

At the end of each set, the fictitious weight w{E) is up- 
dated as 

w{E) := w{E)lh{E), 

and the new weight is used for the next set. In each 
set, the Metropolis single-spin- flip Monte Carlo method 
is used with the target weight W{S) = w{E{S)) for up- 
dating the spin configuration. 

At the end of the last set, one can obtain the density 
of states g{E), as 

9{E) = Y^5e,e{S) ~ const x {w{E))-'^. 
s 

With this g{E), we can compute the canonical average 
of an arbitrary quantity Q as 

Q{T) = ^ g{E)w{E)Q{E) j ^ 9{E)w{E), 

E E 

where Q{E) is the micro-canonical average of Q at the 
energy E, which we can compute in the last set of simu- 
lation. 

While this procedure turned out to be quite useful in 
studying various systems with slow dynamics, one draw- 
back was noticed. That is, the visited range of E do not 
widen much in each set of simulation, partially due to 
the poor initial guess of the weight 'w{E) and also to the 
slow diffusive nature of the random walk in the energy 
space. This drawback was removed in Wang and Lan- 
dau's variant of the MCMC method. In their method, 
w is updated every time an attempt is made at changing 
the spin configuration. The update is done as 

w{E) := r X w{E) (2.54) 

where < r < 1 is the reduction factor fixed through- 
out each set of simulation. At the beginning of each 
set, the histogram is reset, i.e., h{E) := for all E, 
and the reduction factor is updated as r := r" where 
< a < 1, while the weight 'w{E) is kept unchanged. 
Each set does not have a prefixed duration, but is termi- 
nated when the histogram becomes approximately flat. 
Since the dynamic update (2.54) strongly penalizes the 
random walker's staying at the same value of the energy, 
the already visited region widens much faster than the 
ordinary MCMC. 

Since these extended-ensemble methods are comple- 
mentary to the algorithms described in previous subsec- 
tions, it is natural to consider the combination of the 
two. However, since the weight W{S) is not the function 
of the energy observable in a typical quantum Monte 
Carlo method, there is no good reason to assign a spe- 
cial role to the energy observable in the quantum case. 
In addition, the value of the energy observable is not a 
multiple of a single constant, which is inconvenient for 
the present purpose. In the loop algorithm (and in the 
directed-loop algorithm), therefore, the Wang-Landau 
method was used^^"^^ with the histogram of the number 
of vertices (or graph elements), n{G) = G„, rather 
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Fig. 19. The free energy F, the entropy 5", and the specific heat 
C of the antifcrromagnetic Heiscnberg model in one dimension. 
The system's linear dimension is 10 lattice spacings. The relative 
error in the free energy e(F) = K-F— -F'exact)/-Pexact| is also shown. 
(Adopted from Troyer, Wessel and Alet.^®) 



than the energy. (The idea was originally proposed by 
Janke and Kapplcr'"'^ and applied to the Ising model.) 

Two types of the implementation are possible; in one, 
the pair Hamiltonian is decomposed graphically,^^ as in 
the loop algorithm, and in the other, it is used undi- 
vided^^ as in the directed-loop algorithm. We present 
below a continuous-time variant of an algorithm closer 
to the latter. 

We start from (2.15) with u being We control the 
histogram by introducing an adjustable weight /(n) and 
replacing (At)" = by /(n)/L" in (2.15), which 

yields 

Z = lim VvFl(5), 
Wl{S) = ^T^i(5,C?), 

G 

u 

We then perform a Monte Carlo simulation in the {S, G) 
space so that the detailed balance condition is satisfied 
with respect to the weight VKl(5, G). 

One sweep of the simulation consists of two steps, as in 
the ordinary directed-loop algorithm. In the first step we 
update G. To do this, we decompose the whole system 
into uniform intervals (UIs) with no kink. For each pos- 
sible local state, Su, we consider the class of the UIs on 
which the local state is Su- Let the sum of the lengths of 
all the UIs in this class be I{Su) or simply /. We have IL 
local units in this class. (Here we use the convention in 
which the total length of the lattice along the imaginary 
time direction is 1.) Our task, then, is to set variables 
Gu for all of these local units. To do this, we first remove 
all the existing vertices from the UIs. There arc /lC„i 
ways of assigning m vertices to IL units. Therefore, the 
probability of placing m vertices becomes 

IL \ f{no + m) ^^ 
m I L"^ ' 



P{m) oc 



currently chosen type of UIs, and no is the total number 
of vertices before placing m vertices. In the limit L ^ oo 
this leads to 

A ml 

where A is the normalization constant. The procedure 
of assigning graphs to the UIs follows directly from this 
expression. That is, we first generate a uniform random 
number r (0 < r < 1) and find the integer m that satisfies 

X{m - 1) < r < X{m), 

where X{m) = X^^/=o -^(™')- We then choose m points 
uniform-randomly from the intervals belonging to the 
current class and place non-trivial graph elements there. 
We repeat this procedure for all the classes. 

In the next step we update the spin configuration. 
However, this step can be done in exactly the same way 
as described in § 2.10 using worms. 

It should be noted here that in the procedure given 
above, the histogram updating (2.53) (with E{S) re- 
placed by n{G)) can be done only once in a sweep because 
there is no intermediate state in the procedure. On the 
other hand, in other methods such as the original algo- 
rithm proposed by Janke and Kappler,^^ the histogram 
updating can be done after every local assignment of a 
vertex. This difference would make the statistical error 
greater than it should be for the present method. This 
apparent disadvantage can be easily avoided by adding 
P{n) to the histogram, rather than or 1. Specifically, 
we should replace (2.53) by^'' 

h{n) := h{n)+P{n). 

The utility of the extended-ensemble methods can be 
demonstrated best in the computation of the quanti- 
ties directly related to the density of states, such as 
free energy, entropy and specific heat. These quantities 
can be computed easily with the aid of the extended- 
ensemble methods. In Fig. 19, some results"''*' of the com- 
putation with the extended-ensemble method is shown. 
The method used to obtained the results is based on 
the discrete-time formulation and the details are differ- 
ent from the one discussed here. However, the basic idea 
is the same and the efficiency is believed to be similar. 
Because of the cut-off (the maximum order of the ex- 
pansion), the results deviate from the exact results at 
low temperatures {T/J < 0.1) whereas they are indis- 
tinguishable from the exact results at high temperatures 
(T/J > 0.1) in the scale shown in Fig. 19. (Note, how- 
ever, that the accessible temperature range can be easily 
widen by modifying the cut-off.) 

It was shown recently^'' that a further improve- 
ment can be made by employing the broad-histogram 
method."'''''®*' Compared to the other extended-ensemble 
methods, this method is unique in that it is based on 
the exact relation between the expectation value of the 
transition matrix and the density of states. While one es- 
timates the density of states directly from the histogram 
itself in the ordinary extended-ensemble methods, the 
micro-canonical average of the transition matrix is used 



where w is the local weight w = (V'u|(— W„)|V'«) for the 
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in the broad-histogram method. This estimator gener- 
ally gives a better result than the histogram itself be- 
cause of its macroscopic nature. An improvement of more 
than one order of magnitude in the relative error was re- 
ported^^ in the case shown in Fig. 19. 

2.16 Estimators and Efficiency of Algorithms 

As discussed in § 2.2, the local updating algorithm is 
slow to approach the equilibrium because the size of 
the region modified at a time is fixed and can be much 
smaller than the correlation length. On the other hand, 
the size of the modified region is roughly equal to the cor- 
relation length (both in space and in time) in the three 
algorithms discussed in this article; the loop, the worm 
and the directed-loop algorithm. This can be seen by 
considering the estimators of Green's functions. In this 
section, we therefore discuss estimators of various quan- 
tities including Green's functions. 

Let us consider a quantity Q that can be decomposed 
in the same way as the Hamiltonian is in (2.11); 



Then, its canonical average can be expressed with the 
source term as 



iQ) 



dH 



Z{H) / Z{H) 



where 



(2.55) 

(2.56) 



Z{H) ^ Tr (^e-f'^n-HQ)^ ^ 
Accordingly, the weight (2.14) is modified as 

WUS) = [] |[1 - At {Hu - i?Q„)]| V«> • (2.57) 

u 

Substituting (2.56) for (2.55), we can express the thermal 
average of Q as the Monte Carlo average of the estimator 
Q{S) as 



with 



(Othermal = {Q{S))mC 



(2.58) 



q{Su) = 



(v;iAtq„iv«) 



In the limit of i — > oo, eq. (2.59) becomes 



(2.59) 



(2.60) 



Therefore, if the Q is expressed as a diagonal operator, 
the above equation is reduced to 



Q{S) 



1 

- drQir) [L ^ oo), 



(2.61) 



where Q{t) = {tp{T)\Q\tp{T)). For example, the magneti- 
zation Q = can be easily evaluated with this for- 
mula. In general, however, we cannot ignore the contri- 
bution of kinks {ij)'^ ^ '^„) in Q{S). For example, the con- 
tribution of a kink to the total energy (W) is 0(— 1//3). 



Next, suppose that the operator has a non-zero 
off-diagonal matrix element for some local state whereas 
the corresponding matrix element of the Hamiltonian is 
zero. In such a case, the estimator q{Su) diverges, and we 
cannot define Q{S) to start with. An example is the mea- 
surement of the squared magnetization in the transverse 
direction, Q = {J^i^f)^: with the Hamiltonian being 
that of the XX Z Hamiltonian. 

By using the loop algorithm, such non-diagonal quan- 
tities may be measured. * The idea is based on the im- 
proved estimator. An improved estimator is an estimator 
defined in terms of the graph degrees of freedom rather 
than the original spin (or occupation number) degrees 
of freedom. A classical example is Wolff's estimator^^ of 
the susceptibility of the ferromagnetic Ising model. An 
improved estimator can be generally derived as follows. 
Consider first the case where an ordinary estimator Q{S) 
can be defined for a quantity Q. The thermal average can 
be expressed as 

EsW{S)QiS) 



iQ) 



thermal 



_ J:s.gW{S,G)Q{S) _ J:gW(G)Q{G) 

where Q{G) is the fixed-graph average of Q{S): 

Q{G) = ^ W{S, G)Q{S) / W{G). (2.62) 
s I 
In the case of the magnetic susceptibility of the Ising 
model, we take Q{S) = A^"HEi -S'^K Then, eq. (2.62), 
with W{S,G) defined by (2.6) and (2.4), is reduced to 

Q(G) = 2-^<^(G)^A(5,G)Q(5), 
s 

where A{S,G) is defined in (2.9). Let us define clus- 
ter variables ctc so that Si ~ <yc{i) with c(i) being 
the specifier of the cluster to which i belongs. Then, 
Q(S) = N-'^J2t,j SiSj can be rewritten as Q{S) = 
N^^ ^, VcVc'(Tc(Tc' where Vc is the total number of 
sites in the cluster c. Therefore, we obtain 



Q(G) = iV-i^(K)^ 



(2.63) 



as the graphical estimator of the squared magnetization. 

A similar argument can be used for deriving an im- 
proved estimator of the uniform magnetic susceptibility 
of a quantum spin models, 

r/3 



Q = x. 



N- 



dT{M,{T)MM)- 



The ordinary estimator of this quantity is 

Q{S) = {Np)-^Q dXS^X)j , 

where X = {i,T) and the integral / dX stands for 
^ • J dr. Then, similar to the case of the Ising model, 
the improved estimator can be expressed by (2.63) with 
Vc replaced by the cluster magnetization M^, 



dX S'^iX). 
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Fig. 20. The spin configuration S and the graph G that appear 
in the computation of an off-diagonal Green's function. Discon- 
tinuities in the world-lines and in the graph at two points X and 
Y are tolerated. 



Now, let US consider the case where Q{S) cannot be 
defined, as is the case with off'-diagonal quantities such 

as the transverse susceptibility 

Q={N(3)-^rr I dXdYS''{X)S''{Y), 



where X = {i.r) and Y = {j,T') specify two points in 
the space-time. The symbol T indicates the time-ordered 
product. We can express its thermal average as 



Xx 



{Q) 



thermal 



{Npy 



I dXdYZ-^Y^W'{S) 



x(V;'(X)|Sf|V^(X))(V.'(F)|5||V^(y)), 

where ip'iX) and tpiX) are the states just above and be- 
low the point X , respectively. The prime in indicates 
that the summation is taken over all the states that have 
discontinuities at X and Y. Such a state is shown in Fig. 
20. The prime in W'{S) indicates that it allows the dis- 
continuities at the two points. (Note W{S) = when the 
state S has any discontinuities.) 

Then, we introduce graphs in the above expression to 
yield 

Xxx = {NP)-^ I dXdYZ-^Y^V{G)A{S,G)-^ 

•' S,G 

(1 - <^i/>(i,r-|-0)Xi,T-0))(l - <5i/>(i,r'-|-0)Xi,T-((^-64) 

We can take the summation over S. Note here that there 

is no way to assign local spin values along a loop so that 
the value is discontinuous at one and only one point in 
the loop. It means that the summation is zero unless 
the two points X and Y arc connected by the graph G. 
Therefore, the result of the summation becomes 



Xx 



■7 



dXdY 



l(X),liY) 



where 1{X) and 1{Y) are the specifiers of the loops to 
which the points X and Y belong respectively. The sum- 
mation in the numerator is over the set of graphs that 
yields a non-zero term in the summation (2.64), i.e., 
graphs that have at least one matching spin configura- 
tion with two discontinuity points. For most quantities 



and models, this coincides with the set of graphs over 
which the summation in the denominator is taken, i.e., 
the graphs that have at least one matching spin configu- 
ration with no discontinuity points. (In some pathological 
cases, however, this is not the case, and the present esti- 
mator does not work in such cases.) If such is the case, 
the above expression can be simply rewritten as 



X=l{NP)-' 



(2.65) 



MC 



Thus we have obtained estimators of the ^-component 
susceptibility (2.63) for the Ising model and the x- 
component susceptibility (2.65) for the quantum model. 
In both cases, the estimator is expressed as the average 
cluster (or loop) size, Vc = ^c^cf / Ylic ^c- This means 
that the typical size of clusters correctly reflects the cor- 
relation length in the loop algorithm. This is the reason 
why the loop algorithm works very efficiently in reducing 
the critical slowing-down. 

The worm algorithm and the dirccted-loop algorithm 
are also efficient near the critical point for a similar rea- 
son. In these cases, the estimator of Green's functions is 
the frequency of the head's visiting a certain location. 
For example, in order to compute the correlation func- 
tion V{X,Y) = (S'^(X)S'^(y)), we have only to count 
the number of times the head passes the position X — Y 
(relative to the original point). This is quite natural, con- 
sidering that the trajectory of the head in the directed- 
loop algorithm is statistically identical to a loop in the 
loop algorithm in the cases where the two algorithms co- 
incide. In what follows, we see that the estimator is valid 
in general even when the dirccted-loop algorithm does 
not coincide with the loop algorithm. 

We start with eq. (2.34). When the worm weight is cho- 
sen as (2.44), the correlation function can be expressed 
as 



(Arr;)2r(X,F) = 



S':X,Y 



S':i\o worm 

where the summation in the numerator is over the states 
with the head at X and the tail at Y or the other way 
around. The number of times that we encounter a state S 
in the Monte Carlo simulation is proportional to Wj,{S). 
Therefore, the above expression can be rewritten as 



{'nfT{X, Y) dXdY 



{Ax,YiS)dXdY)MC 



(A0(5))mc 

where Ax,y{S) dXdY = 1 if and only if one disconti- 
nuity point is in the interval dX centered at X and the 
other in the interval dY centered at Y. The other func- 
tion A0{S) is 1 if there is no worm in S. Now, we obtain 

m) = dXdYTiX, Y)S{R -{X- Y)) 

1 / dX{Ax,X+R{S))MC 



1 



(A0(5))mc 

2 (n(i?))MC, 
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where n{R) is the average number of times the head 
passes, during a cycle, the point whose distance from 
the origin is R. 

3. Numerical Recipe 

In §2, the mathematical framework of several algo- 
rithms have been described. In the present section, we 
present detailed descriptions of the procedures for real- 
izing the algorithms. Since the efSciencies of the algo- 
rithms have been compared only for a very limited num- 
ber of models, there is not much hope of presenting here 
a flawless recommendation as to which algorithm should 
be used for a given instance. However, there are some 
properties that we know already. For example, the loop 
algorithm is the best of all the algorithms, or at least not 
much worse than any other algorithm, in the cases where 
the relationship (such as (2.65)) between the relevant sus- 
ceptibility and the cluster size holds. The XX Z model of 
an c;asy-axis anisotropy with no magnetic field, and the 
bilinear-biquadratic Heisenberg model are among these 
cases. When a finite magnetic field is present, however, 
the cluster size in the loop algorithm does not in general 
refiect the correlation length correctly, and it may per- 
form poorer even than the local updating algorithm. The 
directed-loop algorithm and the worm algorithm work 
much better for such cases. It should be also pointed 
out that applications of these two algorithms are usually 
more straight-forward than that of the loop algorithm; 
for the loop algorithm we first need to find a good graph- 
ical decomposition of the Hamiltonian, which largely de- 
pends on the Hamiltonian under consideration. However, 
the two algorithms become essentially equivalent to the 
local updating algorithm for the models with the Ising- 
like anisotropy, and hence are not very efficient for reduc- 
ing the critical slowing-down in the Ising-like models. 

The description of the procedures in the following sub- 
sections is given mostly in graphical terms, such as sag- 
ments and vertices defined in the first subsection. There- 
fore, we have to translate it into one of the computer 
languages. While the actual coding with a specific com- 
puter language is out of the scope of the present article, 
a remark on the data structure may be helpful here. The 
coding can be done, in principle, with any commonly- 
used computer language. However, the graphical objects 
that we have to deal with are created and annihilated 
during the simulation and their number varies. In addi- 
tion, when we work with the infinite L limit, there is no 
discrete lattice that usually provides us with the index 
system of arrays. Therefore, we feel that some sort of a 
linked-list data structure is necessary, and that this data 
structure naturally fit in the object-oriented program- 
ming. While the object-oriented programming is possi- 
ble with any language, working with object-oriented lan- 
guages such as the C-|— I- and the Java might make the 
programming easier. For example, the above-mentioned 
graphical entities may be most conveniently defined as 
objects, such as a "class" in the C-|— I- language. A seg- 
ment may be defined as an object that has some (or all) 
of the following member variables (i.e., properties): the 
local spin state (see § 3.1 below), the spatial location, the 
beginning time, the ending time, (the pointers to) the 



vertices that delimit the segment, and a variable used 
for cluster identification (see Appendix B). In a sam- 
ple program which may be found in our web site,^^ the 
graphical objects are handled through a container that 
has a built-in linked-list structure and is provided as a 
part of a standard library of the language. 

3. 1 Graphical Terms for World-Line Monte Carlo 

Using the path-integral representation, our Monte 
Carlo simulation can be formulated as a Markov pro- 
cess in the space of graphical objects called world-lines. 
A world- line is a curve on a {d + l)-dimensional space- 
time lattice, where d denotes the real-space dimension. 
The additional dimension, depicted as the vertical di- 
mension throughout the present article, is called the 
imaginary-time dimension. The periodic boundary con- 
dition is imposed in this direction, while an arbitrary 
boundary condition may be used for the other (spatial) 
dimension. The height of the system, i.e., the system size 
in the imaginary-time direction is the inverse tempera- 
ture (3 = 1/k^T. Therefore, each site i in the real space 
is represented by a vertical line of the length (3. Along 
each vertical line, an integral-valued function n{i,T) is 
defined, which is constant (as a function of r) almost ev- 
erywhere and is discontinuous only at kinks. The value 
of the function is called the local state of the space-time 
point {i,T). A point at which the function is discontin- 
uous is called a kink. A spin configuration (or simply a 
configuration) is the set of the functions along all the ver- 
tical lines. In other words, a configuration is equivalent 
to an assignment of integers to all the space-time points. 

In particular, when the local state is binary, say or 
1, and the sum of the variables over the real space is 
conserved in the imaginary-time direction, which is the 
case with the s = 1/2 XX Z model, a configuration can 
be represented by a set of lines that connect the space- 
time points at which the local state is 1. These are the 
world-lines. In Fig. 21, such a world-line configuration 
of a one- dimensional quantum spin model is shown on a 
(1 + l)-dimensional space-time lattice. 

For a quantum spin model of the spin size s, the in- 
tegral variable n{i,T) takes 2s + 1 values from to 2s. 
The eigenstate of the operator +s is customarily used 
as the local spin variable, where S^ is the z component 
of the spin at the site i. We may also call it the number 
of particles, even when we are considering a spin model. 
Although configurations in the case s > 1/2 cannot be 
represented by the world-lines, we still refer to the con- 
figuration as a "world-line configuration" . 

In addition to the local spin variables, we introduce 
auxiliary variables that can be represented conveniently 
by a graph that consists of vertical lines called segments 
and objects called vertices that delimit the segments. A 
vertex is represented by a graph element in the loop algo- 
rithm, whereas it is represented simply by a single hori- 
zontal line in the directed-loop algorithm. An example in 
the case of the loop algorithm is shown in Fig. 21, where 
a vertex is represented by a pair of two horizontal lines. 
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S = world-lines G = loops 

kinks vertices 



- segments 



Fig. 21. Various graphical objects associated witli a world-line 
configuration S and a graph G that match each other. Shown is 
the case of a one-dimensional system. A vertex, in this case, is 
the graph element that consists of a pair of two horizontal lines. 
A kink is located on the vertex at which the local state changes. 
A segment is a part of a vertical line that is delimited by two 
vertices. 



3.2 Loop/cluster Algorithm 

The loop algorithm can be applied to various quan- 
tum spin models and gives much better performance 
than traditional quantum Monte Carlo methods. The 
models that can be handled efficiently with this algo- 
rithm include the XXZ model with no magnetic field, 
the bilinear-biquadratic model, and the SU(A'^) symmet- 
ric model as discussed below. Note, however, that it may 
not be efficient in the cases where terms in the Hamil- 
tonian conflict with each other, such as the case of the 
anti-ferromagnetic XXZ model in an uniform magnetic 
field, fn such cases, one should consider using the worm 
or the directed-loop algorithm discussed in the following 
subsections. 

One cycle in a loop algorithm generally consists of 
the following operations: (i) assigning graphs to a given 
world-line configuration probabilistically; (ii) decompos- 
ing the world-line configuration into loops or clusters de- 
fined by graphs; (iii) assigning new values to integral vari- 
ables on each loop or cluster probabilistically. The types 
of the graphs to bo assigned and the probability of the 
assignment depend on the details of the model. 

In this subsection, we describe the loop algorithm 
in detail for some characteristic models. Discussed in 
the following are (i) the quantum s = 1/2 XYZ spin 
model, ^'^^ (ii) the quantum s > 1 XYZ spin model, 
(iii) the transverse field Ising model, (iv) the quantum 
s = 1 bilinear-biquadratic model,^^ and (v) the quantum 
SU(iV) model.^4 



3.2.1 Quantum s = 1/2 XYZ spin model^'^^ 

First we consider the loop algorithm for the s 
XYZ model 



1/2 



(3.1) 

where erf is an s = 1/2 spin operator in the a direc- 
tion at the ith site, whose eigenvalues are ±1/2, and the 
summation is over all the pairs of interacting spins. In 
what follows, we consider the case, Jx > Jy > 0. (Other 
cases can be transformed to this case if the lattice is a 
bipartite lattice.) Figure 21 shows a world-line configu- 
ration of this model. On the world-line (the thick gray 



lines on the left panel of Fig. 21), the integral variable 
n(i, t) = af + 1/2 takes 1, whereas it takes elsewhere. 

One sweep of the simulation with the loop algorithm 
consists of two phases: the graph assignment and the 
cluster fiip. The graph assignment is done by (i) delet- 
ing the existing graph, (ii) assigning the graph elements 
to the uniform intervals (UI) with a certain density, and 
(iii) connecting the graph elements by vertical lines to 
form loops/clusters. Here, a UI for a pair of sites {ij) 
is an interval in which no change takes place in the lo- 
cal spin state variable n{i,T) or in n{j,T). The cluster 
fiip is done by (i) identifying the clusters, and (ii) flip- 
ping the clusters. The cluster identiflcation is done as 
the assignment of a number to every segment so that 
the number uniquely specifies the cluster to which the 
segment belongs. The flipping of a cluster is simply the 
simultaneous inversion of the local spin states (from 
to 1 or vice versa) for all the segments in the cluster. 
As shown in Table II, we need three types of graph ele- 
ments for the loop algorithm for the XYZ model. The 
graph elements are called cross, horizontal and binding 
from the top to the bottom. Each graph element rep- 
resents a certain constraint on the integral variables at 
four space-time points, namely, legs. The integral vari- 
ables on the points connected by a line belong to the 
same loop (or cluster) and must be changed simultane- 
ously when a loop is flipped. Although the graph ele- 
ments in Table II are drawn as if they had a finite height 
for clarity of the illustration, they have in fact no tempo- 
ral extension. The types of the graph elements as well as 
the density and the probability of the graph assignment 
depend on the anisotropy. There are four cases to be con- 
sidered separately: (i) Jz > Jx > 0; (ii) Jx > Jz > 0; (iii) 

Jz < 0, \Jz\ > Jx; (iv) Jz <0,Jx> \Jz\- 

To be specific, the updating procedure of the integral 
variables n{i,T) is as follows (see also Fig. 22): 

Step 1 Delete the whole graph. 

Step 2 For each pair of interacting sites (ij), do the fol- 
lowing. Decompose the interval (0, /?) into UIs. Then 
for each UI, place graph elements randomly with the 
density given in Table II-V. (See Appendix A for the 
procedure of generating the temporal positions for the 
placement of the graph elements.) 

Step 3 Place a graph element on every kink with the 
probability given in Table II-V. 

Step 4 Draw vertical lines between two graph elements 
and connect two legs (one from each graph element). 
We refer to the resulting graph as G. 

Step 5 Identify clusters. (For the identification algo- 
rithm, see Appendix B.) 

Step 6 For each cluster, fiip the values of the vari- 
ables n{i,T) (i.e., n{i,T) := 1 — n(i,r)) for all 
the segments on it, with the probability p(c) = 
exp{—Bmc)/{exp{Bm.c)+ex.p{—Bmc)). Here rUc is the 
cluster magnetization of the cluster c defined as 



nin 



(3.2) 
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where dX stands for the sum-integral on the cluster 
(the summation with respect to the site index and the 
integration with respect to the time). 

Step 7 Do measurements. (For the estimators, see 
§3.2.6.) 



(After Step 1) 







li — 1 









(After Steps 4 and 5) 



(After Steps 2 and 3) 

Sxl 



(After Steps 6, 7 and 1) 



n 



Fig. 22. The procedures of one step in the loop algorithm for the 
s = 1/2 XYZ model (0 < < Jz)- 



Table III. The same as Table II for the case < Jz < Jx 



.state 

graph^ 




J 



1 



[ I 



j{Jx Jy) 



\{Jx — Jz 




j{Jx Jy) 



T (Jx + Jy) 



h{\M-J^) 



Tabic IV. The same as Table II for the case 0> Jz, Jx < \Jz\- 



Table II. The density of graph elements for uniform intervals (the 
second and the third colunnis), and the probability of choosing 
graph elements for kinks (the forth and the last), in the case 
where < Jx < Jz- The top row shows the local spin states, 
in which a solid (dashed) line denote an up (down) spin. The 
densities and the probabilities for all the other local states can 
be derived easily using the symmetries with respect to the time 
inversion, the exchange of the two sites, and the spin inversion, 
The first column shows graphs. The spins connected by a solid 
(dashed) line must be parallel (anti-parallel). The second and 
the third columns show the density of graph elements. The forth 
and the fifth column shows the probability of choosing the graph 
element on a kink. 



Nutate 
graph\^ 
















4 (Jx + Jy) 





1 





1 
1 




\iJx — Jy) 








1 






^{Jz-Jx) 












Table V. The same as Table II for the case > Jz, \Jz\ Ja 



^\State 
graph\^ 
















- \M) 





Jx~\~ Jy 





,J 

r'' 





j{Jx ~ Jy) 





1 


L 1 
[ 1 





lilM + Jy) 


\Jz\ + Jy 
Jx~\~Jy 






3.2.2 Quantum XYZ spin model with large spins 

For the XYZ model with spins larger than s = 1/2, we 
can in general use the split-spin technique presented be- 
low. The resulting algorithm are generally efficient when 
the corresponding algorithm for s = 1/2 arc efficient. 
However, for models with the easy-plane anisotropy, such 
as the XY model, the coarse-grained algorithm described 
in § 3.4.2 is recommended since there is no need for work- 
ing with multiple split-spins for each site. On the other 
hand, for models with the easy-axis anisotropy with no 
magnetic field, the split-spin algorithm presented below, 
to our knowledge, is the best choice among the algo- 
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1 I 

T I 

cut graph bind graph 

Fig. 23. The graphs for the transverse Ising model. 

rithms discussed in the present article. 

We consider 2s, instead of one, vertical lines at each 
site, each representing a split spin, or a Pauli spin that 
carries s = 1/2. Correspondingly, for each pair of nearest- 
neighbor sites, we apply the same graph-assignment pro- 
cedure as in the s = 1/2 algorithm to each one of (2.s)^ 
combinations of split spins. In addition, in order to elim- 
inate unphysical states, we stochastically assign a graph 
that represents a permutation among the 2s split spins, 
to the end points of the 2s vertical linos. (One exam- 
ple is shown in Fig. 8 for s = 1.) All the permutation 
graphs that match the current state is chosen with equal 
probability. For example, when the / split spins among 
2s are up and the others arc down, there are Z! ways to 
connect up-spins and II {I = 2s — I) for down-spins. There- 
fore, every matching graph is chosen with the probability 

Thus the split-spin algorithm for the XYZ model with 
an arbitrary s can be obtained by replacing Step 2 in 
§3.2.1 by 

Step 2: For each pair of interacting split-spins, 

and (j, 1^) , do the following. Decompose the inter- 
val (0,(3) into UIs. Then for each UI, place graph 
elements randomly with the density given in Table 
II-V. 

and inserting between Step 3 and Step 4 the following 

(Insertion of the permutation graphs): For each 

site, connect 2s end points of vertical lines at 
T = P to those at r = 0, pairwise, so that an 
up-spin is connected to an up-spin and likewise 
for down-spins. (Choose one of (/!)(l!) ways of 
connection with equal probability.) 

3.2.3 Transverse Ising model^^ 

Next we consider the transverse Ising model, 

n=-J^ata]-B^af (3.3) 

with J > and B > 0. Since the transverse field breaks 
the conservation of the total magnetization, the world- 
lines are discontinuous; a world-line can terminate any- 
where. Accordingly, we need a new type of graphs that 
cut segments at a single point. Therefore, we need two 
types of graph elements in this case as shown in Fig. 23. 

Except for the difference in the types of the graph 
elements, the algorithm is almost the same as the one 
for the XYZ model described above. One cycle of the 
loop algorithm for the transverse-field Ising model is as 
follows: 

Step 1 Delete the graph. 

Step 2 For each pair of the interacting sites {ij), do 
the following. Decompose the interval (0, (3) into UIs. 



'tijj' %Jf l....r 

S% f 1, cr"""!::> 

Fig. 24. The graph elements for quantum s = 1 bilineaj:- 
biquadratic model. As before, the spins connected by a solid 
(dashed) line must be parallel (anti-parallel). 

Then for every UI in which the two spins are parallel 
(i.e., the local spin state on i is the same as that on 
j), place one of the graph elements randomly with the 
density J/2. 

Step 3 For each site i (i.e., vertical line), do the follow- 
ing. Place "cut" graphs along the line with the density 

B/2. 

Step 4 Place a "cut" graph on every kink. 

Step 5 Draw vertical lines to connect legs of the graph 
elements. 

Step 6 Identify the clusters. 

Step 7 Flip every cluster independently with probabil- 
ity 1/2. 

Step 8 Do measurements. (For the estimators, see 
§3.2.6.) 

3.2.4 Quantum s = 1 bilinear-biquadratic model^^ 
The Hamiltonian for the model is 

W = - E i-^i^^i ■ + MSi ■ Sjf) . (3.4) 

(y) 

It is convenient to introduce the parameter 9 by 

Ji = -Jcos0, JQ = -Jiimd (J>0). (3.5) 

There is no negative-sign problem when the coupling con- 
stant Jq is positive (— tt < ^ < 0). We consider this case 
in what follows. 

Using the split-spin technique, the original spin is de- 
composed into two s = 1/2 spins. In addition to the or- 
dinary kinks, wc have double kinks at which two particles 
jump to the neighboring site. (This is because of the bi- 
quadratic term.) The biquadratic term also requires new 
types of graph elements that have eight legs as shown 
in Fig. 24 in addition to the ordinary four-legged graph 
elements. (The third and the fourth graph elements in 
Fig. 24 are eight-legged representation of the ordinary 
single graphs, whereas the first and second ones are the 
new ones.) We call the first double-cross and the second 
double-horizontal. 

From an algorithmic point of view, the region — tt < 
^ < is decomposed into three sub-regions: [— tt, — 37r/4], 

37r/4, — 7r/2] and [— 7r/2, 0]. In the first region, we need 
only single-cross and double-cross graphs. In the third re- 
gion, only single-horizontal and double-horizontal graphs 
are required. In the second region, no single graph is used. 
In this region, only double-cross and double-horizontal 
graphs are sufficient for decomposing the Hamiltonian. 
The procedure of a cycle can be summarized as follows. 

Step 1 Delete the graph. 
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Step 2 For each pair of the nearest-neighbor sites (ij), 
do the following. First decompose the interval (0,/3) 
into UIs. Then, for each UI, and for each type of the 
graph elements, do the following. Place graph elements 
on the UI uniform-randomly with the density given in 
Tables VI- VIII. There are generally multiple ways of 
placing a graph element of a given type to a given 
position, i.e., there are multiple ways of wiring so that 
the spins connected by a solid (dashed) line are parallel 
(anti-parallel). Choose one of the consistent ways of 
wiring with equal probability. 

Step 3 Place a graph element on every kink with the 
probability given in Tables VI- VIII. Again choose one 
of the consistent ways of wiring with equal probability. 

Step 4 For each site, connect two end points of vertical 
lines at T = /9 to those at t = 0, pairwisc, so that 
an up-spin is connected to an up-spin and likewise for 
down-spins. Choose one of such ways of wiring with 
equal probability. 

Step 5 Draw vertical lines so that each line connects a 
leg of a graph element to a leg of another. 

Step 6 Identify clusters. 

Step 7 Flip every cluster independently with probabil- 
ity 1/2. 

Step 8 Do measurements. 



Table VI. The density (a) and the probabihty (b) of the graph 
assignment for the bilinear-biquadratic model with — tt < 6 < 
-37r/4. 

(a) 



\state 

graph\^ 












2{Jl - Jq) 


Jl - Jq 





Jl - Jq 






Jq 









(b) 



\state 
graph\^ 




IH: 






H 






1 





X 


■'Q 
Jl 





1 



Wc have described the loop algorithm with the split- 
spin representation, in which all the loops are binary 
loops. In the region — 37r/4 < 6 < — 7r/2, however, we 
can apply the loop algorithm with non-binary loops as 
described below. 

In the algorithm with the non-binary loops, we do not 
need to split spins into 2s Pauli spins. Wc need two types 
of graph elements, cross and horizontal, as we use for the 
s = 1/2 XY model. However, each loop has three states, 
0, 1 and 2, rather than two. Accordingly, the constraint 



Table VII. The same as Table VI for the case -37r/4 < 6> < -7r/2. 



(a) 



\state 
graph\ 
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(b) 
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Table VIII. The same as Table VI for the case — 7r/2 <e <0. 



(a) 
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grapH^ 
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-2Jl 
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Jq 


Jq 






(b) 
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11 11 
ll ll 


1 


-Jl 
Jq-Jl 





IL...J1' 

A i 





Jq 

Jq-Jl 


1 



imposed by the graph elements must be generalized; the 
local spin variables on \\iq two points connected by a 
cross graph must bo equal whereas those connected by 
the horizontal one must be complementary to each other 
(0 and 2, or 1 and 1). To be more specific, the procedure 
of a cycle is as follows: 

Step 1 Delete the graph. 

Step 2 For each pair of the interacting sites (ij), do 
the following. Decompose the interval (0,/3) into UIs. 
Place cross graphs with the density pc = Jl to the 
UIs for which the local spin states on i and j are the 
same. Then, place horizontal ones with the density 
pu = Jq — Jl to the UIs for which the local spin 
states are complementary to each other. 

Step 3 For each kink, place a graph (cross or horizontal) 
that matches the local state. The only local states that 
match both the graph elements are 

\l m j = ( 2 j (2 j • 
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For these kinks, wc choose the cross graph with the 
probabiHty and the horizontal graph with 1 — -j^ . 
For kinks with the other local states, the matching 
graph element is unique. 

Step 4 Draw vertical lines to connect legs of graph ele- 
ments. 

Step 5 Identify the clusters. 

Step 6 Choose one of the three possible states (0,1 or 
2) for each loop with equal probability. 

Step 7 Do measurements. 

3.2.5 Quantum SU(N) model^'^ 

An SU(A'')-invariant generalization of the Heisenberg 
model can be formally written as 

H = Y,H,, = -jY^Y.jfjt- (3-6) 

(ij) (ij) a, (3 

The symbols J"^ (a, /3 = 1, 2, • • • , A^) denote the gener- 
ators of the SU(A'') algebra that satisfy 



(3.7) 



We consider the model with the fundamental representa- 
tion (and its dual representation) where the local Hilbert 

space is iV-dimensional. The matrix elements of the pair 
Hamiltonian Hij can be specifically written as 



{I' ,m'\Hij\l,m) 



-J Sl^rri' Si'^rn {J > 0) 
J Sl,rh Sl'^fh' {J < 0) 



(3.8) 



where l,m,l',m' = 0,1, ■ ■ ■ , N — 1 are the local 'spin' 

variables anologous to the ones used in the algorithms 
for the SU(2) spin models, and [ = (A^ - 1) - I. Then, 
it can be easily shown that the Hamiltonian is expressed 
by a single type of graph elements; a cross graph if J is 
positive (ferromagnetic) or a horizontal graph if J is neg- 
ative (antiferromagnetic). (Two special cases have been 
described in §3.2.4. The s = 1 bilinear-biquadratic mod- 
els a,t = —n/2 and 9 = — 37r/4 are the same as the 
present SU(3) model with J < and J > 0, respec- 
tively.) 

The prescription for the SU(Af) model directly follows 
from this graphical expression: 

Step 1 Delete the graph. 

Step 2 For each pair of the interacting sites (ij), do 
the following. Decompose the interval (0, /3) into UIs. 
Place cross (horizontal) graphs with the density |J| 
to the UIs for which the local spin states on i and j 
are the same (complemantary to each other) , if J > 
(J<0). 

Step 3 For each kink, place a cross (J > 0) or a hori- 
zontal (J < 0) graph. 

Step 4 Draw vertical lines to connect the legs of graph 
elements and form closed loops. 

Step 5 Identify the loops. 

Step 6 Choose one of the N possible states for each 
loop with equal probability. 

Step 7 Do measurements. 



3.2.6 Estimators 

In order to obtain the thermal average of a quantity 
Q, we compute the Monte Carlo averages of the corre- 
sponding estimator E{S,G). In other words, 

(<3)thermal = {E{S,G))mC- 

While the correspondence between a quantity and an es- 
timator is straightforward in many cases, it is not so ob- 
vious in some other cases. In the following, we present a 
list of the estimators of frequently computed quantities. 
Many estimators depends on the world-line configura- 
tion, S, only, whereas improved estimators depend only 
on the graph G. For the derivation, see § 2.16. 

Estimator of a diagonal operator: In this case, 
the estimator is simply 

EQ{s)^Q{m)^{m\Q\m), 

where Q is the diagonal operator and -0(0) is the spin 
configuration at r = 0. Because of the time-translational 
invariance, a better estimator is 

(3.9) 



Eq{S)^^J^ drQ(V(r)). 



For a conserved quantity such as Q = = J^i 

the XX Z quantum spin model, these two estimators are 

identical. In this particular case, the estimator is simply 



Em4S) = J2M0)- 



The time-dependent correlation function of two diagonal 
operators 

TAB{r,r') = {A{r)B{r')) 

can be computed with the estimator 

EMS) = A{i;{T))B{ij{T')). 

By integrating over the imaginary time, we obtain 
an estimator for the generalized susceptibility xab = 

Jo dTTAB{T,0) as 



= p-' (j^^ dTA(V(T)) j U^^ dTB{ij{ 



Energy and Specific Heat: 

total energy, {H), is 



r)) 



(3.10) 

The estimator for the 



En{S) = Sdiag(H)(5) - ^nkink(5), (3.11) 

where nkink('S') is the total number of kinks in S, and 
Edia.g(H){S) is the estimator for the diagonal part of the 
Hamiltonian, which is defined by (3.9) with Q being 
diag(W). The specific heat is not measured with a sin- 
gle estimator. Instead, it is computed using the following 
expression. 



C 



^diag(H))^C 



f3' 

+ (»^kink)MC 



diag(W))]v[C 



(?^kink)MC ~ (»^kink)M(^3-12) 
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Improved operator: For the diagonal magnatic sus- 
ceptibility, 



Jo 



thermal 



at zero field, the following graphical estimator is often 
useful. For the cases where the clusters can be flipped 
with probability 1/2, i.e., if no field breaks the up-down 
symmetry, the estimator is 



(3.13) 



where the summation is over all the clusters in G, and 
Mc is the cluster magnetization 



/ dX^{X) 

J c 



The symbol dX denotes the summation/integration 
over the cluster c. 

Non-diagonal susceptibility: The susceptibility of 
the spin components perpendicular to the quantization 
axis, i.e., 



Xxx 



dT(Af-(r)M-(0))thcrmal 



can be measured for the XXZ spin model with the esti- 
mator 



4/3. ^ 

where Vc is the cluster volume 

Vr.= f dXl. 
J c 



(3.14) 



3.3 Worm Algorithm-"^ 

The loop algorithm often becomes very inefficient 
(worse than the local updating algorithm) when the 
Hamiltonian contains some terms that conflict with each 
other. Such is the case in the anti-ferromagnetic XXZ 
model in a uniform magnetic field. This difficulty can 
be removed by introducing discontinuity points in the 
world-lines. However, the XXZ models with an easy- 
axis anisotropy should not be dealt with the worm al- 
gorithm if the magnetic field is not conflicting with the 
exchange couplings, since the loop algorithm usually per- 
forms much better in such cases especially near the tran- 
sition point. 

A cycle in the worm algorithm consists of a creation 
and an annihilation of a worm, and a vertical move, a 
jump and an anti-jump of the head. (See Fig. 9). A head 
or a tail of the worm is called positive (negative) if the lo- 
cal state above it is greater (smaller) than the local state 
below by one. If the head is positive the tail must be 
negative and vice versa. When the positive one is created 
above the negative one, the worm is called a "lowering" 
worm since the local state between the two discontinu- 
ity points is lower than the original state by one (Fig. 
12). A "raising" worm is the one in which the positive 
discontinuity point is below the negative one. 



In what follows, we describe the worm algorithm for 
the generic Hamiltonian 

W = ^ {Uij + Vij)-ii^Qu (3.15) 

(ii) i 

where Vij = diag (Tiy ) is the diagonal part of the pair 
Hamiltonian, Uij = Hij — Vij is the off-diagonal part, 
and Qi is an operator defined on the site i that has no 
diagonal part. The last term is a source term which is 
included only for a technical purpose. The constant t] is 
anv finite^ ill number of 0{{Ni3y^/^) where N is the 
total number of sites in the whole system. 

The procedure of one cycle, starting from a state with 
no worm, is the following: 

Step 1-1 [Creation (Fig. 9(a))] Choose one of the two 
types (raising or lowering) of the worm with equal 
probability. 

Step 1-2 Choose a point from the whole space-time. (In 
what follows, we consider a continuous part of a ver- 
tical line delimited by kinks in which the site under 
consideration is involved. The head and the tail of 
the worm themselves are also regarded as kinks here. 
We call such a part, a single-site UI.) We denote the 
sincle-site UI in which the chosen point resides by 

I={{i,T)\T€[h,t2]}. 

Step 1-3 Choose two points (i,Ti) and {i,T2) from / so 
that Ti < T2 with the probability density 



PciTl,T2) = 



exp[-AV X (t2 - n) 

A 



(3.16) 



These points are candidates for the positions at 
which a worm may be created. The constant A is 
the normalization factor and is the average ex- 
cess axion per unit time defined as 

rt2 

^create^^)^ (3.17) 

jes{i) 



where d{i) denotes the set of the nearest neighbors 
of i, and AV"^'^^''{t) is the increase in Vij{t) = 

{ilj{t)\Vij\ijj{t)) that occurs if the two discontinuity 
points are created at the bottom and the top of /. 

Step 1-4 Accept the proposed creation of a worm at 
Ti and T2 with the probability min(l, iicreate) where 
-^create is defined as 



Re 



_ 2N(iTfA 



X exp 



dt \ AV 



create 



it) 



x(^'(ti+0)|O,|V''(ti-0)) 

x(V''(t2 + 0)|Q,|V''(t2-0)), (3.18) 

where ^p' is the state after the creation was accepted. 
If the proposal is rejected, go to Step 6. 

Step 1-5 Choose one of the two discontinuities with 
equal probability, and make it the head. 
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Step 2 [Vertical Move (Fig. 9(b))] Move the head along 
the UI where it currently resides. The new time r is 
chosen with the probability density 

exp[-E,e^(^) AV;— (r)] 



^vertical 



B 



where B is the normalization factor and AK" 



(3.19) 

is the increase in the diagonal contribution 
/yj dt Vij (t) that occurs if the vertical move to the 

time T is accepted. 

Step 3 [Jump (Fig. 9(c))] For each nearest-neighbor site 
j of the current site i, consider two intervals; the one 
delimited by the head itself and the nearest kink 
above it, and the other delimited by the head and 
the neaerst kink below it. (Here, we only consider 
kinks in which the sites i and/or j arc involved.) 
For each of the intervals, do Stops 3-1 and 3-2. 

Step 3-1 Generate a time t, which is the candidate for 
the temporal position of placing a kink, with the 
probability density 



exp(-AVji,inp('r)) 
C 



(3.20) 



where C is the normalization factor and AT^ump(T) 
is the increase in the diagonal contribution to the 
total weight that occurs if the kink is created at the 
proposed time r. Specifically, 



AVjump(T) 



AV-fe(t) 



where ti and t2 denote the starting and the ending 
time of the interval, respectively, and AVij (t) is the 
increase in the Vij{t) after the jump was accepted. 

Step 3-2 Let the head jump from i to j at r, creat- 
ing a kink there, with the probability min(l, -Rjump)- 
(Otherwise, do nothing.) Here, i?jump is defined as 



= C(V''(T + 0)|C/,,#'(r-0)) 



{lP'{T^+0)\Qj\lP'{T^-0)) 



(3.21) 



(V'(T^+0)|Qi|V(T^-0)) ' 

where ip' is the state after the jump was accepted 

and T-uj is the head's current temporal position. 

Step 4 [Anti-Jump (Fig. 9(c))] Consider two directions, 
upward and downward. For each nearest-neighbor 
site J, and for each direction, do the following. If 
the kink, involving i and/or j and the nearest to the 
head in the chosen direction, is not the one between 
i and j, do nothing and go to Step 5. Otherwise, we 
consider the second nearest kink among those which 
involves i and/or j, and the interval delimited by it 
and the head itself. Let the head anti-jump, i.e., let 
it jump from j to i so that the kink between i and j is 
annihilated, with the probability min(l, J?anti-jump), 
whore -Ranti-jump is the reciprocal of (3.21) with ij/ 
being the current state, ip the state that the anti- 
jump would result in, and r the temporal position 
of the nearest kink to be erased by the anti-jump. 



Step 5 [Annihilation (Fig. 9(a))] If the head and the 
tail are located on the same site and there are no 
kinks between the two, annihilate the worm with 
the probability min(l, i?,annihiiate) where J?anmhiiate 
is the reciprocal of (3.18) with being the current 
state, tp the state that the annihilation would result 
in, and ti and T2 the temporal positions of the head 
and the tail. Go to Step 6. If annihilation is not 
chosen, go to Step 2. 

Step 6 Do measurements. (The end of the cycle.) 

3.4 Directed- Loop Algorithm 

The directed-loop algorithm can be thought of as a 
generalization of the worm algorithm. Therefore, remarks 
similar to the ones presented at the beginning of the last 
subsection apply to the directed-loop algorithm. As for 
the simulations of XXZ spin model, the directed al- 
gorithm should be used in the cases with the isotropic 
couplings or the easy-plane couplings, with or without 
a magnetic field. For the easy-axis couplings, the loop 
algorithm should be used. 

In the directed-loop algorithm proposed by Syljuasen 
and Sandvik,^ the spin configuration is updated through 
movements of the worm as in the worm algorithm. One 
"sweep" of the directed-loop algorithm consists of an as- 
signment of vertices and a number of cycles of worm up- 
date. Vertices are represented as horizontal lines connect- 
ing nearest neighbor sites. Unlike the worm algorithm, 
the head of the worm has a direction of motion and it 
can only move in this direction. (Fig. 13 shows a local 
configuration in which a worm and vertices are involved.) 
It can change the direction of motion and its spatial lo- 
cation only when it hits a vertex. One cycle of the worm 
update, therefore, consists of the creation of a worm, the 
vertical movements in the direction of motion, the scat- 
terings at vertices, and the annihilation of the worm. 

3.4-1 Directed-Loop Algorithm for the s = 1/2 XXZ 
Model 

In the present subsection, we specialize in the descrip- 
tion of the s = 1/2 XXZ model. 



with J > and h > 0. The whole parameter space is 
divided into six regions as shown in Fig. 25. The constant 
Eq is chosen as Eq = [J — h)s^ + hs for the regions I and 
V, Ef) = —J's^ + hs for the regions II, III, and IV, and 
Eo = J's^ + h{s - 2s2) for the region VI. (While s = 1/2 
in the present case, the same expression can be used for 
the general s case discussed in the next section.) Within 
each region, the scattering probability and the vertex 
density are simple analytic functions of J, J' and h. 

When the head hits a vertex, one of four ways of 
scattering is chosen probabilistically: turn-back, straight 
(vertical), diagonal, and horizontal, as shown in Fig. 10. 
In what follows, the probability for choosing one out of 
these four is denoted as P(i [S), P(T |E), P{/ |S), or 
P(— *■ |E). Here, E is the local state at the vertex before 
the head's arrival (Table IX). 
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Fig. 25. The six regions in the parameter space of the XXZ 
model. 



Table IX. The local states around a vertex and their symbols 
used in Tables Table X-Table XII. 



Symbol E 



State 



( I 



Symbol E 



State 



i+i9 



The whole space-time can be decomposed into a num- 
ber of UFs. (Here, a uniform interval (UI) is defined for 
a pair of nearest neighbor sites, rather than for a single 
site.) Then, one Monte Carlo step in the directed- loop 
algorithm consists of the following operations: 

Step 1 Remove all the vertices with no kink on it. (As 
a result, only the vertices that delimit UIs remain.) 

Step 2 Place vertices in each UI uniform-randomly with 
the density p(S) given in Table X. 

Step 3 Repeat the cycle (Steps 3-1 through 3-4 or 3-4') 

A^cycio times. 

Step 3-1 Choose a point in the whole space-time 
uniform-randomly, and place the head and the tail 
both at the same point. (If the spin state at the 
point is 1 (up), the initial type of the worm is lowe- 
ing. Otherwise it is raising. (See Fig. 12.)) 

Step 3-2 Choose one of the two discontinuities and 

make it the head. Choose the initial direction of mo- 
tion, upward or downward, with equal probability. 

Step 3-3 Let the head go until it hits a vertex or comes 
back to the original position where the tail stays. If 
it hits a vertex, go to Step 3-4. If it hits the tail, go 
to Step 3-4'. 

Step 3-4 Choose the scattering direction T with the 

probabihty P(r|E) in Table X. (The type of the 
head, positive or negative, is not changed by the 
scattering.) Then, go back to Step 3-3. 



Step 3-4' Let the worm annihilate. (The end of one cy- 
cle.) 

Step 4 Do measurments. (The end of one Monte Carlo 
step.) 

The number of the cycles Ncyde in one Monte Carlo step 
is an arbitrary fixed number. It is usually set so that 
every space-time point is visited once on average during 
a Monte Carlo step. (A sample program based on this 
procedure may be found at a web site.^^) 

3.4-2 Directed-Loop Algorithm for s > 1/2 Models 
(Coarse- Grained Algorithm) 
As stated in §2.10, the head-scattering probability in 
the directed-loop algorithm is not uniquely determined 
by the detailed balance condition. In § 2.11, we have ex- 
plained how we can use a solution to (2.43) for ,s = 1/2 
to obtain a solution for larger spins. ^'^ Since the whole 
parameter space (i.e., J-J'-H space) is devided in the 
six regions in the solution for s = 1/2 (Fig. 25), the 
same devision applies to the larger spins. It should be 
also noted that the turning-back probability in Region 
I is always zero for any s. The directed-loop algorithm 
for large spins is different from that for s = 1/2 in the 
creation and the annihilation of the worm. Otherwise, 
the procedure is the same as the one in the s = 1/2 case 
described in §3.4.1. Therefore, one Monte Carlo step of 
the coarse-grained algorithm for spin s > 1/2 can be ob- 
tained by replacing the Table X by Table XI, and Step 
3-1 and Step 3-4' by the following operations: 

Step 3-1 Choose a point in the whole space-time 
uniform-randomly, and create a worm there. Then, 
choose the initial type of the worm, raising or low- 
ering, with the probability I /{2s) or I /{2s), respec- 
tively. Here / = 0, 1, • • • , 2s is the spin-state variable 
at the chosen point, and I = 2s — I. 

and 

Step 3-4' Let the head annihilate with the tail, or let 
it go through. The probability of the annihilation 
depends on the type of the worm and the local spin 
state I between the head and the tail just before 
the collision. If the worm is of the raising type, the 
annihilation probability is while it is {l)~^ oth- 
erwise. If the head goes through, go back to Step 
3-3. If the worm is annihilated, go to Step 4. 

3.4-3 Soft-Core Boson Model with Repulsive Interac- 
tions 

It was pointed out that the algorithm for general s 
presented above can be used for bosonic models by taking 
the s — *■ 00 limit. ''^ Here, we consider the tight-binding 
soft-core boson model (or the boson Hubbard model) 



n 



E 

(ij) 



{blbj 



b4) 



E 



lirii 



—ni{ni - I) 



(3.22) 



where hi and h\ are an annihilation and a creation oper- 
ator, respectively, on the site i, and Ui = h\hi. We here 
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Table X. The directed-loop algorithm for the quantum s = 1/2 XX Z spin models. The density of vertices p, and the scattering 
probabilities of the head P(r|S) arc shown. The latter arc for presented only in the case where the head is entering the vertex from 
below along the left line. The scattering probabilities for the other cases can be obtained by symmetry transformations. The probability 
of going through (F =1") is simply equal to 1— [the probabilities of the three proper scatterings]. The symbol h denotes the magnetic 
field per pair of spins, which is related to the magnetic field per original spin, H,hy h = H/d for the d-dimensional hyper cubic lattice. 
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Table XI. The coarse-grained algorithm for the quantum XXZ spin model with arbitrary s. The density of vertices, p, and the scattering 
probabilities of the head P(F|S) are shown. The symbol h denotes the magnetic field per pair of Pauli spins, which is related to the 
magnetic field per original spin, H, hy h = Hj{2ds) for the d-dimensional hyper cubic lattice. (J=2s — l,m = 2s — m.) 
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assume that Vq > and Vi > 0. The model can be 
considered as a spin model with s = oo. In the present 
algorithm, the state is updated in much the same way as 
the worm algorithm. A cycle consists of (i) the creation 
of a worm, (ii) the movements of the head, and (iii) the 
annihilation of the worm. We do not use vertices explic- 
itly. Instead, "scattering" points are dynamically created 
with some density ahead of the head, and it is scattered 
at the nearest scattering point. One cycle of the coarse- 
grained algorithm for the soft-core boson model can be 
stated as follows: 

Step 1 Choose a point uniform-randomly in the whole 
space-time, and create there a worm. (One is the 
head and the other is the tail.) The initial type of 
the worm is raising (Fig. 12). 

Step 2 Choose the initial direction of motion, upward 
or downward, with equal probability. 

Step 3 For each nearest-neighbor site j to the current 
site i, consider the UI between i and j ahead of the 
head. For each one of the four scattering directions 
r, generate a time Tj,r in the UI. Here Tj,r is the 
closest to the current temporal position of the head 
among those generated with the density given in Ta- 
ble XII. When this is done for all nearest neighbors j 
and all directions F, choose from the Tj,rs the closest 
to the current temporal position. (Let it be Tj^^ro-) 

Step 4 Compare the nearest scattering point generated 
in Step 3, the nearest kink ahead of the head, and 
the tail if it resides on the same site as the head. 
If the nearest among these three is the scattering 
point, go to Step 5. If it is the kink, go to Step 5'. If 
it is the tail, go to Step 6. 

Step 5 Let the head scatter as specified by jo and Fq. 

Go to Step 3. 

Step 5' Choose the direction of the scattering with the 
probability P(F|S) given in Table XII, and let the 
head scatter. Go to Step 3. 

Step 6 If the head is negative and comes back to the 
tail from below, or if it is positive and comes back 
from above, let it pass the tail with the probability 
1 and go to Step 3. Otherwise, let it pass with the 
probability 1 — 1/n where n is the number of the 
particles ahead of the head just before the collision. 
Go to Step 3. 

Step 7 Let the worm annihilate. 

Step 8 Do measurements. 

3.4-4 Observables 

In the directed-loop algorithm (and also in the algo- 
rithm with the series-expansion representation), the en- 
ergy and the specific heat can simply be computed by 
counting the number of the vertices, 

(H) = -/3-i(nv) 

and 

C=(nv2)-(nv)'-(nv). 

The off-diagonal Green's function r(r,T) is, as in the 
worm algorithm,^ proportional to the frequency of the 



Author Name 

occurrence of the configurations in which the head is sep- 
arated from the tail by the space-time vector x = (r, r). 
Specifically, Green's function of the bosonic models, for 
which the source term is proportional to b + b^, can be 
expressed as 

{bix)bHy)) = Hx - y)) (3.23) 

where A'' is the number of sites and n{x) is the number 
of times the configurations with the head and the tail 
separated from each other by x appear during one cycle 
of the update (i.e., from the creation through the anni- 
hilation of the worm) . See Dorneich and Troyer^^ for the 
measurement of the time-dependent Green's function in 
the series-expansion representation with finite L. 

4. Summary and Future Problems 

We have reviewed recent developments in the Monte 
Carlo simulation methods based on Markov processes in 
the space of the world-line configurations. A few orig- 
inal results are also included; the non-binary algorithm 
for the bilinear-biquadratic model (§ 2.7, § 3.2.4) with the 
SU(2) symmetry, the worm-scattering probability of the 
coarse-grained algorithm for the boson Hubbard model 
(§3.4.3), and the continuous-time formulation of the ex- 
tended ensemble method (§2.15). Wo have focused on 
the three methods of updating the world-line configura- 
tions; the loop algorithm, the worm algorithm, and the 
directed-loop algorithm. The detailed descriptions of the 
numerical procedures have been given. The methods de- 
scribed solve, to a large extent, the problems in the local 
updating algorithm. The solved (or eased) problems in- 
clude (i) the slowing-down near the critical point or the 
zero temperature, (ii) the discretization slowing-down 
(the Wiesler freezing), (iii) the systematic error due to 
the discretization, (iv) the non-ergodicity due to the ar- 
tificially conserved quantities (or the additional slowing- 
down due to the ad hoc global flips introduced for orgod- 
icity), (v) the computation of off-diagonal quantities, (vi) 
the freezing due to the external magnetic field competing 
against the exchange couplings, and (vii) the negative- 
sign problem in a special case of spinless fermions. On the 
other hand, a number of problems remain to be solved. 
Particidarly important among them are (i) the general 
fermion sign problem, (ii) the general frustration sign 
problem, (iii) the models with interaction terms compet- 
ing against cac;li other, such as the spin model with a 
strong anisotropy (uniaxial, cubic, tetragonal, etc), and 
(iv) the models for which the order parameter cannot be 
expressed as a simple sum of local operators. The nature 
of the negative-sign problem may be quite different, de- 
pending on its origin. Whether it is due to the fermion 
sign or the frustration, however, there is not even a clue 
to a general solution. It is possible that a general solu- 
tion does not even exist. An example of the problems (iii) 
and (iv) can be found in the s > 1 Heiscnberg models 
with the cubic anisotropy. While such an anisotropy is 
quite common in real materials, we are not aware of any 
efficient algorithms. For example, the model 
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Table XII. The densities of scattering points /9(r|S) and the scattering probability at kinks P(r|E) for a local state S. The densities 
are shown only in the case where the head is moving upward. (The densities in the other case can be obtained simply by changing the 
sign of the head.) The F specifies the direction of the head after the scattering. The z denotes the coordination number, e.g., z = 2d 
for the d-dimensional hyper-cubic lattice. 
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with positive J and D does not cause the negative-sign 
problem when the ordinary 5^ representation basis is 
used. In addition, it is easy to find a graphical decom- 
position such as (2.19) for the model. However, the 
auto-correlation time of the simulation is extremely long 
at low temperatures and the configuration is practically 
frozen. A similar situation can be seen in another model 
with the cubic anisotropy 

= —J ^ ^ Si ■ Sj 

Again this model does not cause the negative sign prob- 
lem when the S*^ representation basis is used. How- 
ever, when we use a representation basis in which the 
model's symmetry is manifest, i.e., when the three vec- 
tors |±) = I t ) ± I I) and |0) arc used as the basis set, the 
model exhibits negative signs while the computational 
auto-correlation time is small in this basis. The Ising 
spin-glass problems may be considered as another exam- 
ple of the problem (iii). It is well-known that the auto- 
correlation time of this model is so long that an accurate 
numerical simulation is extremely difficult near and be- 
low the critical temperature. However, when the 5*^ rep- 
resentation basis is used, it is easy to find a simulation 
method with a small auto-correlation time, although the 
negative-sigh problem appears in the representation 
basis. These facts appear to suggest that the difficulty of 
the problems (iii) and (iv) are closely related to (ii) in 
general. 
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Appendix A: Generation of Temporal Positions 
in Graph (or Vertex) Assignment 

If a stochastic process consists of a time series of events 

with a given density and if each event occurs indepen- 
dently, the process is a Poisson process. (Specifically in 
the present context, these events correspond to graph- 
elements, vertices or scattering points.) In a Poisson pro- 
cess, a time interval between successive events obeys the 
exponential distribution. When the density of events is 
n, the distribution of the intervals is 

piAt) = [ [^;>0) . (A.l) 

We can generate a random variable At that obeys this 
distribution from the uniform random variable r G (0, 1] 



by using the transformation 

At=-'^. (A.2) 

n 

Therefore, placing events (objects) on a given time- 
window (segment) with a given density n can be done 
as follows: 

Step 1 Set the time variable t to be the starting time 
of the window. 

Step 2 Generate a uniform random number r G (0, 1] 

and compute At using (A-2). 

Step 3 Increase the time t by At, i.e., t := t + At. 

Step 4 If the new time t is smaller than the ending time 
of the window, place an object at t, and go back to 
Step 2. Otherwise, terminate the process. 

Appendix B: Cluster Identification 

In an actual computer program, the world-line con- 
figuration is represented as a linked-list data structure 
of objects, where each object corresponds to a segment. 
In order to identify loops and clusters, we define a vari- 
able, which will eventually be the cluster ID number, for 
each object. (In the C-|— I- language, for example, we add 
a member variable to the "segment" object.) When a 
graph element is assigned and points are connected, the 
variables are updated as follows. 

Let c(s) be the variable of the object that is specified 
(or is pointed to) by an index (or by a pointer) s. In 
what follows, we identify an index with the object that 
is specified by it. The variables define a tree structure. 
That is, c(s) is a parent of s, and c(s) = s if it does 
not have a parent. Every object initially has no parent. 
When two segments, ,s and .s', arc connected by an edge 
in a graph element, the following operations are applied: 

Step 1 Find the root of each object. Let r and r' be the 

roots of s and s', respectively. This can be done by 
repeating r := c(r) starting from r := s until r = c(r) 
holds. The same for r'. 

Step 2 Let R := min(r,r'). Then, let c(a) := R for all 
a, where a are the ancestors of s and s' . 

After these procedures have been done for all connec- 
tions, c(s) is the unique identifier of the cluster, i.e., 
c(,s) = c(s') if and only if s and s' belong to the same 
cluster. 

Choosing the root which has more children than the 
other in Step 2, we can make this procedure more ef- 
ficient. It can be done by using the new variable n(r) 
which holds the number of children of a root r. Starting 
from all n{s) = 1, we only need to update the variable 
as n{R) := n{r) + n{r') in Step 2. 
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